data_juicer.ops.mapper.calibrate_query_mapper module

class data_juicer.ops.mapper.calibrate_query_mapper.CalibrateQueryMapper(api_model: str = 'gpt-4o', *, api_endpoint: str | None = None, response_path: str | None = None, system_prompt: str | None = None, input_template: str | None = None, reference_template: str | None = None, qa_pair_template: str | None = None, output_pattern: str | None = None, try_num: Annotated[int, Gt(gt=0)] = 3, model_params: Dict = {}, sampling_params: Dict = {}, **kwargs)[source]

Bases: CalibrateQAMapper

Calibrate query in question-answer pairs based on reference text.

This operator adjusts the query (question) in a question-answer pair to be more detailed and accurate, while ensuring it can still be answered by the original answer. It uses a reference text to inform the calibration process. The calibration is guided by a system prompt, which instructs the model to refine the question without adding extraneous information. The output is parsed to extract the calibrated query, with any additional content removed.

DEFAULT_SYSTEM_PROMPT = '请根据提供的【参考信息】对问答对中的【问题】进行校准,        使其更加详细、准确,且仍可以由原答案回答。只输出校准后的问题,不要输出多余内容。'
parse_output(raw_output)[source]