# key_value_grouper

Groups samples into batches based on the values of specified keys.

This operator groups samples by the values of the given keys, which can be nested. If no keys are provided, it defaults to the text key. It uses a naive grouping strategy: samples with identical key values are placed in the same batch. The resulting dataset is a list of batched samples, where each batch contains the samples that share the same key values. This is useful for organizing data by specific attributes or features.

Type: **grouper**

Tags: cpu, text

## 🔧 Parameter Configuration

| name | type | default | desc |
|--------|------|--------|------|
| `group_by_keys` | typing.Optional[typing.List[str]] | `None` | Group samples according to the values of these keys. |
| `args` | | `''` | extra args |
| `kwargs` | | `''` | extra keyword args |

## 📊 Effect demonstration

### test_key_value_grouper

```python
KeyValueGrouper(['meta.language'])
```

#### 📥 input data

Today is Sunday and it's a happy day!
Welcome to Alibaba.
欢迎来到阿里巴巴!
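
The naive grouping strategy described above can be illustrated with a minimal sketch. This is not the operator's actual implementation: `get_nested` and `group_by_keys` are hypothetical helpers, the dotted-key lookup is inferred from the `meta.language` example, and the `meta.language` values (`en`, `zh`) attached to the three input texts are assumed for illustration. Each batch is represented here as a plain list of samples, which may differ from the operator's real batch format.

```python
def get_nested(sample, dotted_key):
    """Follow a dotted key such as 'meta.language' through nested dicts."""
    value = sample
    for part in dotted_key.split("."):
        value = value[part]
    return value


def group_by_keys(samples, keys=("text",)):
    """Hypothetical sketch: batch samples whose values under `keys` are identical."""
    groups = {}  # dicts keep insertion order, so batches follow first appearance
    for sample in samples:
        # The tuple of key values identifies the group (values must be hashable).
        group_id = tuple(get_nested(sample, key) for key in keys)
        groups.setdefault(group_id, []).append(sample)
    return list(groups.values())


# Assumed metadata for the three input texts shown above.
samples = [
    {"text": "Today is Sunday and it's a happy day!", "meta": {"language": "en"}},
    {"text": "Welcome to Alibaba.", "meta": {"language": "en"}},
    {"text": "欢迎来到阿里巴巴!", "meta": {"language": "zh"}},
]

batches = group_by_keys(samples, ["meta.language"])
for batch in batches:
    print([s["text"] for s in batch])
```

Under these assumed labels, the two English samples end up in one batch and the Chinese sample in another. Calling `group_by_keys(samples)` without keys falls back to the `text` key, so every distinct text forms its own batch.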