data_juicer.ops.mapper.dialog_sentiment_detection_mapper module

class data_juicer.ops.mapper.dialog_sentiment_detection_mapper.DialogSentimentDetectionMapper(api_model: str = 'gpt-4o', sentiment_candidates: List[str] | None = None, max_round: Annotated[int, Ge(ge=0)] = 10, *, labels_key: str = 'dialog_sentiment_labels', analysis_key: str = 'dialog_sentiment_labels_analysis', api_endpoint: str | None = None, response_path: str | None = None, system_prompt: str | None = None, query_template: str | None = None, response_template: str | None = None, candidate_template: str | None = None, analysis_template: str | None = None, labels_template: str | None = None, analysis_pattern: str | None = None, labels_pattern: str | None = None, try_num: Annotated[int, Gt(gt=0)] = 3, model_params: Dict = {}, sampling_params: Dict = {}, **kwargs)

Bases: Mapper

Mapper that generates sentiment labels for the user's turns in a dialog. It reads its input from history_key, query_key and response_key, and outputs a list of labels and a list of analyses for the queries in the dialog.

DEFAULT_SYSTEM_PROMPT = '请判断用户和LLM多轮对话中用户所具有的情绪。\n要求:\n- 需要先进行分析,然后罗列用户所具有的情绪,下面是一个样例,请模仿样例格式输出。\n用户:最近工作压力好大,我觉得整个人都快被压垮了。\n情感分析:用户的言语中透露出明显的压力和疲惫感,可能还夹杂着一些无助和焦虑。\n情感类别:压力、疲惫、无助、焦虑\nLLM:听起来你真的承受了很多,面临这种情况确实不容易。有没有考虑过找一些放松的方式,比如听音乐或者散步来减轻压力呢?\n用户:试过了,但是好像没什么效果,每天的事情都堆积如山。\n情感分析:用户感到无力解决现状,有挫败感,并且对尝试放松的方式失去信心。\n情感类别:无力、挫败\nLLM:我理解你的感受,有时候压力积累到一定程度确实让人难以承受。或许你可以尝试规划一下时间,把任务分成小块来完成,这样可能会减少一些压力感。\n用户:这个主意不错,我会试着让自己更有条理一些,谢谢你的建议。\n情感分析:用户对建议表现出认同和感激,同时展现出试图积极面对问题的态度。\n情感类别:认同、感激、积极\nLLM:不用谢,我很高兴能帮到你。记得给自己一些时间去适应新的计划,有任何需要随时可以跟我说哦!\n'
DEFAULT_QUERY_TEMPLATE = '用户:{query}\n'
DEFAULT_RESPONSE_TEMPLATE = 'LLM:{response}\n'
DEFAULT_CANDIDATES_TEMPLATE = '备选情感类别:[{candidate_str}]'
DEFAULT_ANALYSIS_TEMPLATE = '情感分析:{analysis}\n'
DEFAULT_LABELS_TEMPLATE = '情感类别:{labels}\n'
DEFAULT_ANALYSIS_PATTERN = '情感分析:(.*?)\n'
DEFAULT_LABELS_PATTERN = '情感类别:(.*?)($|\n)'
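
The default analysis and labels templates and the two default patterns mirror each other: text rendered with the templates can be parsed back with the patterns. The following is a minimal sketch of that round trip using only Python's re module and the constant values listed above. The helper names render and parse are hypothetical, and the way parse_output actually applies the patterns is an assumption, not something this page confirms.

```python
import re
from typing import Tuple

# Default values copied from the class constants listed above.
ANALYSIS_TEMPLATE = '情感分析:{analysis}\n'
LABELS_TEMPLATE = '情感类别:{labels}\n'
ANALYSIS_PATTERN = '情感分析:(.*?)\n'
LABELS_PATTERN = '情感类别:(.*?)($|\n)'


def render(analysis: str, labels: str) -> str:
    """Hypothetical helper: format one analysis/labels pair with the templates."""
    return (ANALYSIS_TEMPLATE.format(analysis=analysis)
            + LABELS_TEMPLATE.format(labels=labels))


def parse(text: str) -> Tuple[str, str]:
    """Hypothetical helper: extract the first analysis and labels from text.

    Returns empty strings when a pattern does not match.
    """
    analysis_match = re.search(ANALYSIS_PATTERN, text)
    labels_match = re.search(LABELS_PATTERN, text)
    analysis = analysis_match.group(1) if analysis_match else ''
    labels = labels_match.group(1) if labels_match else ''
    return analysis, labels


# Round trip: render a pair, then parse it back.
rendered = render('用户感到无力解决现状,有挫败感。', '无力、挫败')
parsed = parse(rendered)
print(parsed)
```

When supplying custom templates, keeping the matching pattern in sync (same prefix, same line terminator) is what lets this round trip keep working.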
__init__(api_model: str = 'gpt-4o', sentiment_candidates: List[str] | None = None, max_round: Annotated[int, Ge(ge=0)] = 10, *, labels_key: str = 'dialog_sentiment_labels', analysis_key: str = 'dialog_sentiment_labels_analysis', api_endpoint: str | None = None, response_path: str | None = None, system_prompt: str | None = None, query_template: str | None = None, response_template: str | None = None, candidate_template: str | None = None, analysis_template: str | None = None, labels_template: str | None = None, analysis_pattern: str | None = None, labels_pattern: str | None = None, try_num: Annotated[int, Gt(gt=0)] = 3, model_params: Dict = {}, sampling_params: Dict = {}, **kwargs)

Initialization method.

Parameters:
  • api_model – API model name.

  • sentiment_candidates – Candidate sentiment labels for the output. If None, open-domain sentiment labels are used.

  • max_round – Maximum number of dialog rounds used to build the prompt.

  • labels_key – The key name in the meta field under which the output labels are stored. Defaults to ‘dialog_sentiment_labels’.

  • analysis_key – The key name in the meta field under which the corresponding analysis is stored. Defaults to ‘dialog_sentiment_labels_analysis’.

  • api_endpoint – URL endpoint for the API.

  • response_path – Path to extract content from the API response. Defaults to ‘choices.0.message.content’.

  • system_prompt – System prompt for the task.

  • query_template – Template for query part to build the input prompt.

  • response_template – Template for response part to build the input prompt.

  • candidate_template – Template for sentiment candidates to build the input prompt.

  • analysis_template – Template for analysis part to build the input prompt.

  • labels_template – Template for labels part to build the input prompt.

  • analysis_pattern – Regular-expression pattern used to parse the sentiment analysis from the model output.

  • labels_pattern – Regular-expression pattern used to parse the sentiment labels from the model output.

  • try_num – The number of retry attempts when there is an API call error or output parsing error.

  • model_params – Parameters for initializing the API model.

  • sampling_params – Extra parameters passed to the API call, e.g. {‘temperature’: 0.9, ‘top_p’: 0.95}.

  • kwargs – Extra keyword arguments.
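
As an illustration of the parameters above, the sketch below collects a set of constructor arguments and defers the actual import and instantiation to a function, so the snippet loads even where data_juicer is not installed. The candidate labels and the chosen values are hypothetical, the helper name build_mapper is not part of the library, and any API credentials or endpoint configuration the model needs is assumed to be set up separately.

```python
from typing import Any, Dict

# Illustrative constructor arguments; every value here is an example choice.
mapper_kwargs: Dict[str, Any] = {
    'api_model': 'gpt-4o',
    # Hypothetical closed label set; omit it to use open-domain labels.
    'sentiment_candidates': ['joy', 'anger', 'anxiety', 'gratitude'],
    'max_round': 5,   # must be >= 0 per the signature
    'try_num': 3,     # must be > 0 per the signature
    'sampling_params': {'temperature': 0.9, 'top_p': 0.95},
}


def build_mapper(**overrides: Any):
    """Hypothetical helper: instantiate the mapper with the arguments above.

    The import is deferred so that defining this function does not require
    data_juicer to be installed.
    """
    from data_juicer.ops.mapper.dialog_sentiment_detection_mapper import (
        DialogSentimentDetectionMapper,
    )
    return DialogSentimentDetectionMapper(**{**mapper_kwargs, **overrides})
```

Calling build_mapper(max_round=0) would, under these assumptions, create a mapper with a max_round of 0 while reusing the remaining arguments.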

build_input(history, query)
parse_output(response)
process_single(sample, rank=None)

Process a single sample and return the processed sample (sample -> sample).

Parameters:
  • sample – The sample to process.

Returns:
  The processed sample.
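
To make the per-round flow concrete, here is a rough sketch of how a dialog could be labelled one query at a time, with max_round limiting how much earlier context goes into each prompt. It is not the library's implementation: label_dialog, fake_annotate and LINES_PER_ROUND are hypothetical names, the model call is replaced by a stub, and the assumption that each past round contributes four formatted lines (query, analysis, labels, response) is an illustrative choice.

```python
from typing import Callable, Dict, List, Tuple

# Default templates copied from the class constants.
QUERY_TEMPLATE = '用户:{query}\n'
RESPONSE_TEMPLATE = 'LLM:{response}\n'
ANALYSIS_TEMPLATE = '情感分析:{analysis}\n'
LABELS_TEMPLATE = '情感类别:{labels}\n'

LINES_PER_ROUND = 4  # assumed: query, analysis, labels, response


def label_dialog(
    dialog: List[Tuple[str, str]],
    annotate: Callable[[str], Tuple[str, str]],
    max_round: int = 10,
) -> Dict[str, List[str]]:
    """Label each (query, response) round, feeding earlier rounds as context."""
    history: List[str] = []
    labels_list: List[str] = []
    analysis_list: List[str] = []
    for query, response in dialog:
        # Keep only the last max_round rounds; max_round == 0 means no context.
        context = history[-max_round * LINES_PER_ROUND:] if max_round > 0 else []
        prompt = ''.join(context) + QUERY_TEMPLATE.format(query=query)
        analysis, labels = annotate(prompt)
        analysis_list.append(analysis)
        labels_list.append(labels)
        history.extend([
            QUERY_TEMPLATE.format(query=query),
            ANALYSIS_TEMPLATE.format(analysis=analysis),
            LABELS_TEMPLATE.format(labels=labels),
            RESPONSE_TEMPLATE.format(response=response),
        ])
    return {
        'dialog_sentiment_labels': labels_list,
        'dialog_sentiment_labels_analysis': analysis_list,
    }


prompts_seen: List[str] = []


def fake_annotate(prompt: str) -> Tuple[str, str]:
    """Stub standing in for the API call plus output parsing."""
    prompts_seen.append(prompt)
    return 'stub analysis', 'stub label'


result = label_dialog([('q1', 'r1'), ('q2', 'r2')], fake_annotate, max_round=1)
print(result)
```

The returned dictionary uses the default labels_key and analysis_key names; in the real operator these lists are described as being stored in the sample's meta field.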