data_juicer.ops.mapper.dialog_sentiment_intensity_mapper module

class data_juicer.ops.mapper.dialog_sentiment_intensity_mapper.DialogSentimentIntensityMapper(api_model: str = 'gpt-4o', max_round: Annotated[int, Ge(ge=0)] = 10, *, intensities_key: str = 'dialog_sentiment_intensity', analysis_key: str = 'dialog_sentiment_intensity_analysis', api_endpoint: str | None = None, response_path: str | None = None, system_prompt: str | None = None, query_template: str | None = None, response_template: str | None = None, analysis_template: str | None = None, intensity_template: str | None = None, analysis_pattern: str | None = None, intensity_pattern: str | None = None, try_num: Annotated[int, Gt(gt=0)] = 3, model_params: Dict = {}, sampling_params: Dict = {}, **kwargs)[source]

Bases: Mapper

Mapper that predicts the user's sentiment intensity in a dialog (from -5 to 5 with the default prompt). Input is read from history_key, query_key and response_key. Output is a list of intensities and a list of analyses for the queries in the dialog.

DEFAULT_SYSTEM_PROMPT = '请判断用户和LLM多轮对话中用户的情绪变化。\n要求:\n- 用户情绪值是-5到5之间到整数,-5表示极度负面,5表示极度正面,-5到5之间数值表示情绪从负面逐渐到正面的变化过程,0代表情呈绪中性。\n- 只输出当轮对话的分析,不要继续构造对话。\n- 需要先进行分析,然后确定用户的情绪值,下面是一个样例,请模仿样例格式输出。\n用户:你好,我对可持续发展的定义有点模糊,帮我解释一下?\n情绪分析:刚开始,还没得到LLM回复,用户情绪呈中性。\n情绪值:0\nLLM:当然可以!可持续发展是指在满足当代人的需求的同时,不损害子孙后代满足其自身需求的能力的发展模式。它包括经济发展、社会发展和环境保护三个主要方面。通过合理利用资源和保护环境,我们可以确保未来的世代也能享有健全的生态系统和经济制度。\n用户:谢谢你的解释!那你能告诉我一些普通人可以采取的可持续生活方式吗?\n情绪分析:对回答感到满意,情绪正面。\n情绪值:1\nLLM:当然可以,普通人可以通过减少一次性产品的使用、选择公共交通或拼车、节约用水、以及支持本地和可持续发展的企业等方式来践行可持续生活。此外,关注垃圾分类和多用电子账单也是不错的选择。\n用户:你提到支持本地企业,这一点我很感兴趣。能详细说说为什么这对可持续发展有促进作用吗?\n情绪分析:觉得回答实用且具体,情绪进一步转好。\n情绪值:2\nLLM:呃,我最近发现了一部新电影,讲述了一个关于外星人和地球土著合作保护环境的故事。虽然它是科幻片,但很有启发性,推荐你去看看。\n用户:什么吗,根本是答非所问。\n情绪分析:LLM没有回应问题而是提到无关内容,导致用户情绪直线下降。\n情绪值:-2\nLLM:抱歉刚才的偏题!支持本地企业有助于减少长途运输产生的碳足迹,使供应链更加环保。此外,本地企业也更有可能采用可持续的生产方式,同时促进社区经济的繁荣。\n用户:还行吧,算你能够掰回来。\n情绪分析:问题得到解答,问题偏题得到纠正,情绪稍有好转。\n情绪值:-1\n'
DEFAULT_QUERY_TEMPLATE = '用户:{query}\n'
DEFAULT_RESPONSE_TEMPLATE = 'LLM:{response}\n'
DEFAULT_ANALYSIS_TEMPLATE = '情绪分析:{analysis}\n'
DEFAULT_INTENSITY_TEMPLATE = '情绪值:{intensity}\n'
DEFAULT_ANALYSIS_PATTERN = '情绪分析:(.*?)\n'
DEFAULT_INTENSITY_PATTERN = '情绪值:(.*?)($|\n)'
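
The default templates and patterns above are designed to mirror each other: the templates write a labeled line, and the patterns read it back. The following is a minimal sketch of that round trip, not the mapper's own code. The string constants are copied as they appear in the listing above; the helper names `render_round` and `parse_reply`, the use of `re.search`, and the conversion of the intensity to `int` are assumptions made for illustration.

```python
import re

# Templates and patterns copied from the class attributes listed above.
QUERY_TEMPLATE = "用户:{query}\n"
RESPONSE_TEMPLATE = "LLM:{response}\n"
ANALYSIS_TEMPLATE = "情绪分析:{analysis}\n"
INTENSITY_TEMPLATE = "情绪值:{intensity}\n"
ANALYSIS_PATTERN = "情绪分析:(.*?)\n"
INTENSITY_PATTERN = "情绪值:(.*?)($|\n)"


def render_round(query, analysis, intensity, response):
    """Render one finished round in the order used by the default system prompt.

    Hypothetical helper: query, then analysis, then intensity, then LLM response.
    """
    return (
        QUERY_TEMPLATE.format(query=query)
        + ANALYSIS_TEMPLATE.format(analysis=analysis)
        + INTENSITY_TEMPLATE.format(intensity=intensity)
        + RESPONSE_TEMPLATE.format(response=response)
    )


def parse_reply(reply):
    """Extract (analysis, intensity) from a reply (hypothetical helper).

    The analysis falls back to "" and the intensity to None when a field
    is missing or the intensity is not an integer.
    """
    analysis_match = re.search(ANALYSIS_PATTERN, reply)
    intensity_match = re.search(INTENSITY_PATTERN, reply)
    analysis = analysis_match.group(1).strip() if analysis_match else ""
    try:
        intensity = int(intensity_match.group(1).strip()) if intensity_match else None
    except ValueError:
        intensity = None
    return analysis, intensity


text = render_round("Thanks, that helps!", "Satisfied with the answer.", 1, "You're welcome.")
parsed = parse_reply(text)
print(parsed)
```

If you override a template, the matching pattern usually needs the same change, since the pattern relies on the label that the template emits.
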
__init__(api_model: str = 'gpt-4o', max_round: Annotated[int, Ge(ge=0)] = 10, *, intensities_key: str = 'dialog_sentiment_intensity', analysis_key: str = 'dialog_sentiment_intensity_analysis', api_endpoint: str | None = None, response_path: str | None = None, system_prompt: str | None = None, query_template: str | None = None, response_template: str | None = None, analysis_template: str | None = None, intensity_template: str | None = None, analysis_pattern: str | None = None, intensity_pattern: str | None = None, try_num: Annotated[int, Gt(gt=0)] = 3, model_params: Dict = {}, sampling_params: Dict = {}, **kwargs)[source]

Initialization method.

Parameters:
  • api_model -- API model name.

  • max_round -- The maximum number of dialog rounds used to build the prompt.

  • intensities_key -- The key name in the meta field under which the output sentiment intensities are stored. Defaults to 'dialog_sentiment_intensity'.

  • analysis_key -- The key name in the meta field under which the corresponding analysis is stored. Defaults to 'dialog_sentiment_intensity_analysis'.

  • api_endpoint -- URL endpoint for the API.

  • response_path -- Path to extract content from the API response. Defaults to 'choices.0.message.content'.

  • system_prompt -- System prompt for the task.

  • query_template -- Template for the query part of the input prompt.

  • response_template -- Template for the response part of the input prompt.

  • analysis_template -- Template for the analysis part of the input prompt.

  • intensity_template -- Template for the intensity part of the input prompt.

  • analysis_pattern -- Pattern used to parse the returned sentiment analysis.

  • intensity_pattern -- Pattern used to parse the returned sentiment intensity.

  • try_num -- The number of retry attempts when there is an API call error or output parsing error.

  • model_params -- Parameters for initializing the API model.

  • sampling_params -- Extra parameters passed to the API call, e.g. {'temperature': 0.9, 'top_p': 0.95}.

  • kwargs -- Extra keyword arguments.
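
The try_num parameter describes a retry budget that covers both API call errors and output parsing errors. The sketch below shows one way such a loop could look; it is not taken from the mapper's source. The names `call_with_retries`, `flaky_model` and `parse_intensity` are hypothetical, the model is a local stub rather than a real API client, and returning None once the budget is exhausted is an assumption.

```python
def call_with_retries(call_model, parse, prompt, try_num=3):
    """Call the model up to try_num times until a reply parses (hypothetical helper).

    Both an exception from call_model and a parse result of None count as a
    failed attempt. Returns None when every attempt fails.
    """
    for _ in range(try_num):
        try:
            reply = call_model(prompt)
        except Exception:
            # Simulated API call error: spend one attempt and try again.
            continue
        result = parse(reply)
        if result is not None:
            return result
    return None


def parse_intensity(reply):
    """Read the integer after the default intensity label, or None if absent or invalid."""
    prefix = "情绪值:"
    for line in reply.splitlines():
        if line.startswith(prefix):
            try:
                return int(line[len(prefix):].strip())
            except ValueError:
                return None
    return None


attempts = []


def flaky_model(prompt):
    """Stub model: fails once, returns an unparsable reply once, then succeeds."""
    attempts.append(prompt)
    if len(attempts) == 1:
        raise ConnectionError("simulated API error")
    if len(attempts) == 2:
        return "no score in this reply"
    return "情绪值:2\n"


result = call_with_retries(flaky_model, parse_intensity, "用户:hi\n", try_num=3)
print(result, len(attempts))
```
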

build_input(history, query)[source]
parse_output(response)[source]
process_single(sample, rank=None)[source]

Sample-level processing: sample --> sample

Parameters:

sample -- sample to process

Returns:

processed sample
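
To make the shape of the output concrete, here is a rough sketch of a per-round annotation loop that produces one intensity and one analysis per query. It is an illustration under assumptions, not the mapper's implementation: `annotate_dialog` and `stub_judge` are hypothetical names, the judge is a keyword-based stub standing in for the API model, and the way max_round limits the context (keeping the last max_round finished rounds) is a guess at the intended behavior.

```python
def annotate_dialog(rounds, judge, max_round=10):
    """Annotate each query in a dialog (hypothetical helper).

    rounds: list of (query, response) pairs in dialog order.
    judge:  callable(prompt) -> (analysis, intensity); stands in for the model call.
    """
    history = []  # one rendered text block per finished round
    intensities, analyses = [], []
    for query, response in rounds:
        # Guard max_round == 0 explicitly: history[-0:] would keep everything.
        context = "".join(history[-max_round:]) if max_round > 0 else ""
        prompt = context + f"用户:{query}\n"
        analysis, intensity = judge(prompt)
        analyses.append(analysis)
        intensities.append(intensity)
        # Store the finished round so that later rounds can see it as context.
        history.append(
            f"用户:{query}\n情绪分析:{analysis}\n情绪值:{intensity}\nLLM:{response}\n"
        )
    return {
        "dialog_sentiment_intensity": intensities,
        "dialog_sentiment_intensity_analysis": analyses,
    }


def stub_judge(prompt):
    """Keyword-based stand-in for the model; looks only at the latest query line."""
    last_query = prompt.rstrip("\n").splitlines()[-1]
    if "off-topic" in last_query:
        return "The reply ignored the question.", -2
    return "The user sounds satisfied.", 1


dialog = [
    ("Can you explain sustainable development?", "Sure, it means..."),
    ("That answer was off-topic.", "Sorry, let me try again."),
]
meta = annotate_dialog(dialog, stub_judge, max_round=10)
print(meta["dialog_sentiment_intensity"])
```

The two returned lists are aligned by position, so entry i of the analysis list explains entry i of the intensity list.
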