nested_aggregator¶
Aggregates nested content from multiple samples into a single summary.
This operator uses a recursive summarization approach to aggregate content from multiple samples. It processes the input text, which is split into sub-documents, and generates a summary that maintains the average length of the original documents. The aggregation is performed using an API model, guided by system prompts and templates. The operator supports retrying the API call in case of errors and allows for customization of the summarization process through various parameters. The default system prompt and templates are provided in Chinese, but they can be customized. The operator uses a Hugging Face tokenizer to handle tokenization.
聚合来自多个样本的嵌套内容,生成单一摘要。
该算子使用递归汇总方法来聚合来自多个样本的内容。它处理被分割成子文档的输入文本,并生成一个保持原始文档平均长度的摘要。聚合过程使用 API 模型进行,由系统提示和模板指导。该算子支持在出错时重试 API 调用,并允许通过各种参数自定义汇总过程。默认的系统提示和模板以中文提供,但可以自定义。该算子使用 Hugging Face 分词器来处理分词。
Type 算子类型: aggregator
Tags 标签: cpu, api, text
🔧 Parameter Configuration 参数配置¶
name 参数名 |
type 类型 |
default 默认值 |
desc 说明 |
---|---|---|---|
|
<class ‘str’> |
|
API model name. |
|
<class ‘str’> |
|
The input key in the meta field of the samples. |
|
<class ‘str’> |
|
The output key in the aggregation field in the |
|
typing.Optional[typing.Annotated[int, Gt(gt=0)]] |
|
The max token num of the total tokens of the |
|
typing.Optional[str] |
|
URL endpoint for the API. |
|
typing.Optional[str] |
|
Path to extract content from the API response. |
|
typing.Optional[str] |
|
The system prompt. |
|
typing.Optional[str] |
|
The template for input text in each sample. |
|
typing.Optional[str] |
|
The input template. |
|
typing.Annotated[int, Gt(gt=0)] |
|
The number of retry attempts when there is an API |
|
typing.Dict |
|
Parameters for initializing the API model. |
|
typing.Dict |
|
Extra parameters passed to the API call. |
|
|
Extra keyword arguments. |
📊 Effect demonstration 效果演示¶
not available 暂无