data_juicer.ops.mapper.annotation.human_preference_annotation_mapper module
- class data_juicer.ops.mapper.annotation.human_preference_annotation_mapper.HumanPreferenceAnnotationMapper(label_config_file: str = None, answer1_key: str = 'answer1', answer2_key: str = 'answer2', prompt_key: str = 'prompt', chosen_key: str = 'chosen', rejected_key: str = 'rejected', **kwargs)
Bases: LabelStudioAnnotationMapper
Operator for human preference annotation using Label Studio.
This operator presents a prompt and a pair of answers to a human annotator in Label Studio, using either the default label configuration or a custom one. Each sample must provide the prompt and both answers under the configured keys; if a key is missing, the operator logs a warning and substitutes placeholder text. Once an annotation is available, the operator determines the preferred answer and updates the sample with the chosen and rejected answers.
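The last step described above, turning a pairwise choice into chosen and rejected fields, can be sketched in plain Python. This is a minimal illustration, not the operator's actual code: `apply_preference` is a hypothetical helper, and it assumes the annotation has already been reduced to the string "left" (first answer preferred) or "right" (second answer preferred).

```python
def apply_preference(sample, selected,
                     answer1_key="answer1", answer2_key="answer2",
                     chosen_key="chosen", rejected_key="rejected"):
    """Return a copy of `sample` with the chosen and rejected answers set.

    Hypothetical helper: `selected` is assumed to be "left" or "right".
    """
    if selected not in ("left", "right"):
        raise ValueError(f"unexpected selection: {selected!r}")

    first, second = sample[answer1_key], sample[answer2_key]
    updated = dict(sample)  # shallow copy, the input sample is left untouched
    if selected == "left":
        updated[chosen_key], updated[rejected_key] = first, second
    else:
        updated[chosen_key], updated[rejected_key] = second, first
    return updated


sample = {"prompt": "What is 2 + 2?", "answer1": "4", "answer2": "5"}
result = apply_preference(sample, "left")
```

Here `result` keeps the original fields and gains `chosen` and `rejected`, which is the shape preference-tuning datasets typically expect.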
- DEFAULT_LABEL_CONFIG = '\n <View className="root">\n <Style>\n .root {\n box-sizing: border-box;\n margin: 0;\n padding: 0;\n font-family: \'Roboto\',\n sans-serif;\n line-height: 1.6;\n background-color: #f0f0f0;\n }\n\n .container {\n margin: 0 auto;\n padding: 20px;\n background-color: #ffffff;\n border-radius: 5px;\n box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.1), 0 6px 20px 0 rgba(0, 0, 0, 0.1);\n }\n\n .prompt {\n padding: 20px;\n background-color: #0084ff;\n color: #ffffff;\n border-radius: 5px;\n margin-bottom: 20px;\n box-shadow: 0 2px 4px 0 rgba(0, 0, 0, 0.1), 0 3px 10px 0 rgba(0, 0, 0, 0.1);\n }\n\n .answers {\n display: flex;\n justify-content: space-between;\n flex-wrap: wrap;\n gap: 20px;\n }\n\n .answer-box {\n flex-basis: 49%;\n padding: 20px;\n background-color: rgba(44, 62, 80, 0.9);\n color: #ffffff;\n border-radius: 5px;\n box-shadow: 0 2px 4px 0 rgba(0, 0, 0, 0.1), 0 3px 10px 0 rgba(0, 0, 0, 0.1);\n }\n\n .answer-box p {\n word-wrap: break-word;\n }\n\n .answer-box:hover {\n background-color: rgba(52, 73, 94, 0.9);\n cursor: pointer;\n transition: all 0.3s ease;\n }\n\n .lsf-richtext__line:hover {\n background: unset;\n }\n\n .answer-box .lsf-object {\n padding: 20px\n }\n </Style>\n <View className="container">\n <View className="prompt">\n <Text name="prompt" value="$prompt" />\n </View>\n <View className="answers">\n <Pairwise name="comparison" toName="answer1,answer2"\n selectionStyle="background-color: #27ae60; box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.2); border: 2px solid #2ecc71; cursor: pointer; transition: all 0.3s ease;"\n leftChoiceValue="answer1" rightChoiceValue="answer2" />\n <View className="answer-box">\n <Text name="answer1" value="$answer1" />\n </View>\n <View className="answer-box">\n <Text name="answer2" value="$answer2" />\n </View>\n </View>\n </View>\n </View>\n '¶
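The default configuration above reads three task variables: `$prompt`, `$answer1`, and `$answer2`. The sketch below shows one way a sample could be mapped onto those variables, including the warn-and-substitute behaviour for missing keys. `build_task_data` and the placeholder wording are assumptions made for illustration; the operator's real implementation and placeholder text may differ.

```python
import logging

logger = logging.getLogger(__name__)


def build_task_data(sample, prompt_key="prompt",
                    answer1_key="answer1", answer2_key="answer2"):
    """Map sample fields onto the variables used by the label config."""
    # Template variable name -> sample key that should supply its value.
    variables = {
        "prompt": prompt_key,
        "answer1": answer1_key,
        "answer2": answer2_key,
    }
    data = {}
    for variable, key in variables.items():
        if key in sample:
            data[variable] = sample[key]
        else:
            logger.warning("Sample is missing key %r", key)
            # Placeholder wording is an assumption, not the library's text.
            data[variable] = f"No {variable} provided"
    return data


task = build_task_data(
    {"question": "Capital of France?", "answer1": "Paris"},
    prompt_key="question",
)
```

In this example the sample stores its prompt under `question` and has no second answer, so `task["answer2"]` holds the placeholder and a warning is logged.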
- __init__(label_config_file: str = None, answer1_key: str = 'answer1', answer2_key: str = 'answer2', prompt_key: str = 'prompt', chosen_key: str = 'chosen', rejected_key: str = 'rejected', **kwargs)
Initialize the human preference annotation operator.
- Parameters:
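label_config_file – Path to a custom Label Studio label config file (default: None)
answer1_key – Sample key that holds the first answer (default: 'answer1')
answer2_key – Sample key that holds the second answer (default: 'answer2')
prompt_key – Sample key that holds the prompt or question (default: 'prompt')
chosen_key – Sample key under which the chosen answer is stored (default: 'chosen')
rejected_key – Sample key under which the rejected answer is stored (default: 'rejected')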
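As a usage sketch, the keyword arguments below point the operator at a dataset whose fields are named differently from the defaults. The field names (`question`, `response_a`, and so on) are made up for illustration. The constructor call is left commented out because it needs data_juicer installed and, presumably, Label Studio connection settings passed through `**kwargs`, which this page does not document.

```python
# Keyword arguments matching the documented __init__ signature.
op_kwargs = {
    "label_config_file": None,       # the signature default
    "prompt_key": "question",        # hypothetical dataset field names
    "answer1_key": "response_a",
    "answer2_key": "response_b",
    "chosen_key": "preferred",
    "rejected_key": "dispreferred",
}

# With data_juicer installed and a Label Studio instance configured:
# from data_juicer.ops.mapper.annotation.human_preference_annotation_mapper import (
#     HumanPreferenceAnnotationMapper,
# )
# op = HumanPreferenceAnnotationMapper(**op_kwargs)
```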