data_juicer.ops.grouper.naive_reverse_grouper module
- class data_juicer.ops.grouper.naive_reverse_grouper.NaiveReverseGrouper(batch_meta_export_path=None, *args, **kwargs)
Bases: Grouper
Split batched samples into individual samples.
This operator processes a dataset by splitting each batched sample into individual samples. It also handles and optionally exports batch metadata.
- If a sample contains 'batch_meta', it is separated and can be exported to a specified path.
- The operator converts the remaining data from a dictionary of lists to a list of dictionaries, effectively unbatching the samples.
- If batch_meta_export_path is provided, the batch metadata is written to this file in JSON format, one entry per line.
- If no samples are present in the dataset, the original dataset is returned.
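The behavior described above can be illustrated with a minimal standalone sketch. This is not the library's implementation: the function name `unbatch_samples` is hypothetical, the input is assumed to be a plain list of dictionaries rather than a dataset object, and opening the export file in overwrite mode is an assumption.

```python
import json
from typing import Any, Dict, List, Optional


def unbatch_samples(
    batched_samples: List[Dict[str, Any]],
    batch_meta_export_path: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for the unbatching step (not the library code)."""
    # An empty input is returned unchanged.
    if not batched_samples:
        return batched_samples

    individual_samples: List[Dict[str, Any]] = []
    batch_metas: List[Any] = []

    for batch in batched_samples:
        batch = dict(batch)  # shallow copy so the caller's dict is not mutated
        # Separate the batch-level metadata from the per-sample columns.
        if "batch_meta" in batch:
            batch_metas.append(batch.pop("batch_meta"))

        keys = list(batch.keys())
        # Dictionary of lists -> list of dictionaries: walk the columns in
        # lockstep and build one dictionary per row.
        for values in zip(*(batch[k] for k in keys)):
            individual_samples.append(dict(zip(keys, values)))

    # Optionally export the metadata as JSON, one entry per line.
    if batch_meta_export_path is not None:
        with open(batch_meta_export_path, "w", encoding="utf-8") as f:
            for meta in batch_metas:
                f.write(json.dumps(meta, ensure_ascii=False) + "\n")

    return individual_samples


batched = [
    {"text": ["a", "b"], "score": [0.1, 0.9], "batch_meta": {"source": "demo"}},
]
samples = unbatch_samples(batched)
# samples == [{"text": "a", "score": 0.1}, {"text": "b", "score": 0.9}]
```

Passing a path as the second argument would additionally write `{"source": "demo"}` as a single line to that file.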