data_juicer.ops.grouper.naive_grouper module

class data_juicer.ops.grouper.naive_grouper.NaiveGrouper(*args, **kwargs)

Bases: Grouper

Group all samples in a dataset into a single batched sample.

This operator takes a dataset and combines all its samples into one batched sample. If the input dataset is empty, it returns an empty dataset. The resulting batched sample is a dictionary where each key corresponds to a list of values from all samples in the dataset.
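The batching behaviour described above can be illustrated with a small, library-free sketch. The function name `group_all` is hypothetical and is not part of data_juicer; the sketch also assumes that every sample has the same keys and that the result is a one-element list holding the batched sample. It is meant only to show the shape of the output, not the operator's actual implementation.

```python
from typing import Any, Dict, List


def group_all(samples: List[Dict[str, Any]]) -> List[Dict[str, List[Any]]]:
    """Hypothetical stand-in for the grouping step (not the library's code)."""
    # An empty input is returned unchanged.
    if not samples:
        return samples
    # Assumption: all samples share the keys of the first sample.
    keys = list(samples[0].keys())
    # Each key maps to the list of that key's values across all samples.
    batched = {key: [sample[key] for sample in samples] for key in keys}
    # The output holds exactly one batched sample.
    return [batched]


samples = [
    {"text": "a", "id": 1},
    {"text": "b", "id": 2},
]
result = group_all(samples)
print(result)  # [{'text': ['a', 'b'], 'id': [1, 2]}]
```

Note that the order of values in each list follows the order of samples in the input, so index i across all lists refers to the same original sample.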

__init__(*args, **kwargs)

Initialization method.

Parameters:
  • args -- extra positional arguments

  • kwargs -- extra keyword arguments

process(dataset)

Map an input dataset to an output dataset.

Parameters:

dataset -- input dataset

Returns:

dataset of batched samples.