data_juicer.ops.mapper.query_intent_detection_mapper module
- class data_juicer.ops.mapper.query_intent_detection_mapper.QueryIntentDetectionMapper(hf_model: str = 'bespin-global/klue-roberta-small-3i4k-intent-classification', zh_to_en_hf_model: str | None = 'Helsinki-NLP/opus-mt-zh-en', model_params: Dict = {}, zh_to_en_model_params: Dict = {}, *, label_key: str = 'query_intent_label', score_key: str = 'query_intent_label_score', **kwargs)[source]
Bases:
Mapper
Predicts the user's intent label and the corresponding confidence score for a given query. The operator classifies the input query with a Hugging Face model. If zh_to_en_hf_model is not None, a second Hugging Face model first translates the query from Chinese to English. The predicted label and its score are stored in the meta field under the keys given by label_key and score_key, which default to 'query_intent_label' and 'query_intent_label_score', respectively. Samples whose meta field already contains these keys are skipped.
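The control flow described above can be sketched in plain Python. This is a minimal illustration, not the operator's actual implementation: `detect_intent`, `fake_classify`, `fake_translate`, and the `"query"` and `"meta"` field names are hypothetical stand-ins, and the two callables replace the Hugging Face models so the sketch runs without downloads. It also assumes a sample is skipped only when both keys are present.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Sample = Dict[str, Any]


def detect_intent(
    sample: Sample,
    classify: Callable[[str], Tuple[str, float]],
    translate: Optional[Callable[[str], str]] = None,
    label_key: str = "query_intent_label",
    score_key: str = "query_intent_label_score",
) -> Sample:
    """Hypothetical sketch of the skip / translate / classify flow."""
    # "meta" and "query" are assumed field names used only for this sketch.
    meta = sample.setdefault("meta", {})

    # Skip samples that already carry both output keys.
    if label_key in meta and score_key in meta:
        return sample

    query = sample["query"]
    # Optional translation step, applied only when a translator is given.
    if translate is not None:
        query = translate(query)

    label, score = classify(query)
    meta[label_key] = label
    meta[score_key] = score
    return sample


ZH_QUERY = "今天天气怎么样"


def fake_translate(text: str) -> str:
    # Stand-in for the Chinese-to-English translation model.
    return {ZH_QUERY: "How is the weather today?"}.get(text, text)


def fake_classify(text: str) -> Tuple[str, float]:
    # Stand-in for the intent classifier: returns (label, score).
    return ("question", 0.9) if text.endswith("?") else ("statement", 0.6)


fresh = detect_intent({"query": ZH_QUERY}, fake_classify, fake_translate)
tagged = detect_intent(
    {
        "query": "Open the door.",
        "meta": {"query_intent_label": "command", "query_intent_label_score": 0.99},
    },
    fake_classify,
)
```

Here `fresh` receives a new label and score, while `tagged` is returned unchanged because both keys were already present in its meta field.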
- __init__(hf_model: str = 'bespin-global/klue-roberta-small-3i4k-intent-classification', zh_to_en_hf_model: str | None = 'Helsinki-NLP/opus-mt-zh-en', model_params: Dict = {}, zh_to_en_model_params: Dict = {}, *, label_key: str = 'query_intent_label', score_key: str = 'query_intent_label_score', **kwargs)[source]
Initialization method.
- Parameters:
hf_model -- Hugging Face model ID used to predict the intent label.
zh_to_en_hf_model -- Hugging Face model ID for Chinese-to-English translation. If not None, the query is translated from Chinese to English before classification.
model_params -- Model parameters for hf_model.
zh_to_en_model_params -- Model parameters for zh_to_en_hf_model.
label_key -- The key name in the meta field under which the output label is stored. Defaults to 'query_intent_label'.
score_key -- The key name in the meta field under which the corresponding label score is stored. Defaults to 'query_intent_label_score'.
kwargs -- Extra keyword arguments.
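As a rough illustration of how these parameters fit together, the sketch below collects the documented defaults in a dict and derives a variant that disables translation. The import path, class name, parameter names, and default values come from the signature above; `DEFAULT_KWARGS`, `build_mapper`, `english_only`, and the `'intent'` key names are hypothetical helpers introduced only for this example. The import is deferred into the function, so the block itself runs without data_juicer installed.

```python
from typing import Any, Dict

# Defaults copied from the documented signature.
DEFAULT_KWARGS: Dict[str, Any] = {
    "hf_model": "bespin-global/klue-roberta-small-3i4k-intent-classification",
    "zh_to_en_hf_model": "Helsinki-NLP/opus-mt-zh-en",
    "model_params": {},
    "zh_to_en_model_params": {},
    "label_key": "query_intent_label",
    "score_key": "query_intent_label_score",
}


def build_mapper(**overrides: Any):
    """Hypothetical helper: construct the operator with overridden defaults.

    Requires data_juicer to be installed; the import is deferred so that
    merely defining this function has no such requirement.
    """
    from data_juicer.ops.mapper.query_intent_detection_mapper import (
        QueryIntentDetectionMapper,
    )

    return QueryIntentDetectionMapper(**{**DEFAULT_KWARGS, **overrides})


# Variant for English-only queries: no translation model, custom meta keys.
english_only: Dict[str, Any] = {
    **DEFAULT_KWARGS,
    "zh_to_en_hf_model": None,
    "label_key": "intent",
    "score_key": "intent_score",
}
```

With data_juicer available, `build_mapper(zh_to_en_hf_model=None)` would be expected to yield an operator that classifies queries without translating them first.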