generate_qa_from_text_mapper
Generates question and answer pairs from text using a specified model.
This operator uses a Hugging Face model to generate QA pairs from the input text. It supports both the Hugging Face and vLLM backends for inference. The recommended models, such as 'alibaba-pai/pai-llama3-8b-doc2qa', are trained on Chinese data and are suited to Chinese text. The operator can limit the number of generated QA pairs per text and accepts a custom output pattern for parsing the model's response. By default, it uses a regular expression to extract questions and answers from the model's output. If no QA pairs are extracted, a warning is logged.
Type: mapper
Tags: cpu, vllm, hf, text
🔧 Parameter Configuration
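The regex-based parsing step mentioned above can be illustrated with a short, self-contained sketch. The pattern below is a hypothetical Human/Assistant-style regex, not necessarily the operator's actual default `output_pattern`; it only shows how QA pairs can be pulled out of a model response and capped at a `max_num`-style limit:

```python
import re

# Illustrative pattern for a Human/Assistant-style response; the operator's
# real default output_pattern may differ, so treat this as an assumption.
QA_PATTERN = r"Human:(.*?)Assistant:(.*?)(?=Human:|$)"

# Mocked model output in the dialogue format shown in the parameter table below.
model_response = (
    "Human: 请问蒙古国的首都是哪里?\n"
    "Assistant: 你好,根据提供的信息,蒙古国的首都是乌兰巴托(Ulaanbaatar)。\n"
    "Human: 冰岛的首都是哪里呢?\n"
    "Assistant: 冰岛的首都是雷克雅未克(Reykjavik)。\n"
)

max_num = 1  # like the operator's max_num: keep at most this many pairs (None = no limit)

qa_pairs = [
    (q.strip(), a.strip())
    for q, a in re.findall(QA_PATTERN, model_response, flags=re.DOTALL)
]
if max_num is not None:
    qa_pairs = qa_pairs[:max_num]

if not qa_pairs:
    print("warning: no QA pairs extracted")  # the operator logs a warning in this case
for question, answer in qa_pairs:
    print(question, "->", answer)
```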
| name | type | default | desc |
|---|---|---|---|
| `hf_model` | `<class 'str'>` | | Hugging Face model ID. |
| `max_num` | `typing.Optional[typing.Annotated[int, Gt(gt=0)]]` | | The maximum number of QA samples returned for each text. No limit if it is None. |
| `output_pattern` | `typing.Optional[str]` | | Regular expression pattern used to extract questions and answers from the model response. |
| `enable_vllm` | `<class 'bool'>` | | Whether to use vLLM for inference acceleration. |
| `model_params` | `typing.Optional[typing.Dict]` | | Parameters for initializing the model. |
| `sampling_params` | `typing.Optional[typing.Dict]` | | Sampling parameters for text generation, e.g. `{'temperature': 0.9, 'top_p': 0.95}`. |
| `kwargs` | | | Extra keyword arguments. |

The default data format parsed by this interface is as follows:

Model Input:

    蒙古国的首都是乌兰巴托(Ulaanbaatar)
    冰岛的首都是雷克雅未克(Reykjavik)

Model Output:

    蒙古国的首都是乌兰巴托(Ulaanbaatar)
    冰岛的首都是雷克雅未克(Reykjavik)
    Human: 请问蒙古国的首都是哪里?
    Assistant: 你好,根据提供的信息,蒙古国的首都是乌兰巴托(Ulaanbaatar)。
    Human: 冰岛的首都是哪里呢?
    Assistant: 冰岛的首都是雷克雅未克(Reykjavik)。
    ...
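As a usage illustration of the parameters above, the sketch below instantiates the operator on one text sample. The class path `data_juicer.ops.mapper.GenerateQAFromTextMapper`, the `process_batched` entry point, and the `query`/`response` output fields are assumptions based on Data-Juicer's usual operator conventions, not guarantees from this page; running it also requires downloading the Hugging Face model.

```python
# A minimal usage sketch, assuming the operator is exposed as
# GenerateQAFromTextMapper and follows the usual Data-Juicer batched-mapper
# interface; verify the names against your installed data-juicer version.
from data_juicer.ops.mapper import GenerateQAFromTextMapper

op = GenerateQAFromTextMapper(
    hf_model='alibaba-pai/pai-llama3-8b-doc2qa',  # Chinese doc2qa model named above
    max_num=2,                                    # keep at most 2 QA pairs per text
    enable_vllm=False,                            # plain Hugging Face inference
    sampling_params={'temperature': 0.9, 'top_p': 0.95},
)

# One input text in the format shown in the note above.
samples = {
    'text': [
        '蒙古国的首都是乌兰巴托(Ulaanbaatar)\n冰岛的首都是雷克雅未克(Reykjavik)',
    ],
}

# Each extracted QA pair becomes one output sample (assumed 'query'/'response' fields).
result = op.process_batched(samples)
print(result)
```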
📊 Effect demonstration
not available