data_juicer.ops.mapper.query_sentiment_detection_mapper module
- class data_juicer.ops.mapper.query_sentiment_detection_mapper.QuerySentimentDetectionMapper(hf_model: str = 'mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis', zh_to_en_hf_model: str | None = 'Helsinki-NLP/opus-mt-zh-en', model_params: Dict = {}, zh_to_en_model_params: Dict = {}, *, label_key: str = 'query_sentiment_label', score_key: str = 'query_sentiment_label_score', **kwargs)
Bases: Mapper
Predicts the user’s sentiment label (‘negative’, ‘neutral’, or ‘positive’) for a query.
This mapper takes input from the specified query key and outputs the predicted sentiment label and its corresponding score. The results are stored in the Data-Juicer meta field under ‘query_sentiment_label’ and ‘query_sentiment_label_score’. It uses a Hugging Face model for sentiment detection. If a Chinese-to-English translation model is provided, it first translates the query from Chinese to English before performing sentiment analysis.
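The flow described above (optional translation, then classification, then writing the label and score into the meta field) can be sketched roughly as follows. This is a minimal illustration and not Data-Juicer's actual implementation: the function `detect_query_sentiment`, the field names `"query"` and `"meta"`, and the toy `toy_translate` and `toy_classify` callables are all assumptions standing in for the real Hugging Face models.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical stand-ins: a real setup would wrap Hugging Face models here.
Classifier = Callable[[str], Tuple[str, float]]
Translator = Callable[[str], str]


def detect_query_sentiment(
    sample: Dict,
    classify: Classifier,
    translate: Optional[Translator] = None,
    query_key: str = "query",  # assumed name of the query field
    meta_key: str = "meta",    # assumed name of the meta field
    label_key: str = "query_sentiment_label",
    score_key: str = "query_sentiment_label_score",
) -> Dict:
    """Store a sentiment label and score for the sample's query in its meta field."""
    query = sample[query_key]
    # Translate first only when a translation callable is configured.
    if translate is not None:
        query = translate(query)
    label, score = classify(query)
    meta = sample.setdefault(meta_key, {})
    meta[label_key] = label
    meta[score_key] = score
    return sample


def toy_translate(text: str) -> str:
    # Toy lookup table in place of a Chinese-to-English translation model.
    return {"这个产品太糟糕了": "This product is terrible"}.get(text, text)


def toy_classify(text: str) -> Tuple[str, float]:
    # Toy keyword rule in place of a sentiment classification model.
    lowered = text.lower()
    if "terrible" in lowered:
        return "negative", 0.9
    if "great" in lowered:
        return "positive", 0.9
    return "neutral", 0.5


translated = detect_query_sentiment(
    {"query": "这个产品太糟糕了"}, toy_classify, toy_translate
)
untranslated = detect_query_sentiment({"query": "这个产品太糟糕了"}, toy_classify)
```

With the toy callables, the translated sample is labeled ‘negative’, while the untranslated one falls through to ‘neutral’ because the English-only keyword rule never matches the Chinese text.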
- __init__(hf_model: str = 'mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis', zh_to_en_hf_model: str | None = 'Helsinki-NLP/opus-mt-zh-en', model_params: Dict = {}, zh_to_en_model_params: Dict = {}, *, label_key: str = 'query_sentiment_label', score_key: str = 'query_sentiment_label_score', **kwargs)
Initialization method.
- Parameters:
hf_model – Hugging Face model ID used to predict the sentiment label.
zh_to_en_hf_model – Hugging Face model ID of the Chinese-to-English translation model. If not None, the query is translated from Chinese to English before sentiment detection.
model_params – Model parameters for hf_model.
zh_to_en_model_params – Model parameters for zh_to_en_hf_model.
label_key – The key name in the meta field under which the output label is stored. Defaults to ‘query_sentiment_label’.
score_key – The key name in the meta field under which the corresponding label score is stored. Defaults to ‘query_sentiment_label_score’.
kwargs – Extra keyword arguments.
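As an illustration of the parameters above, the following sketch collects one possible set of keyword arguments and defers construction to a helper. The argument names come from the documented signature; the chosen values (`'sentiment'`, `'sentiment_score'`, and disabling translation) and the helper `build_mapper` are assumptions for the example, and the helper requires Data-Juicer to be installed when it is actually called.

```python
from typing import Any, Dict

# Keyword arguments named as in the documented signature; values are illustrative.
mapper_kwargs: Dict[str, Any] = {
    "hf_model": "mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis",
    "zh_to_en_hf_model": None,  # None skips the Chinese-to-English translation step
    "model_params": {},
    "zh_to_en_model_params": {},
    "label_key": "sentiment",        # custom meta key for the label
    "score_key": "sentiment_score",  # custom meta key for the score
}


def build_mapper(kwargs: Dict[str, Any]):
    """Hypothetical helper: construct the mapper from keyword arguments."""
    # Imported lazily so the snippet can be loaded without Data-Juicer installed.
    from data_juicer.ops.mapper.query_sentiment_detection_mapper import (
        QuerySentimentDetectionMapper,
    )

    return QuerySentimentDetectionMapper(**kwargs)
```

Calling `build_mapper(mapper_kwargs)` in an environment with Data-Juicer installed would produce a mapper that writes its results under the custom keys instead of the defaults.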