data_juicer.format.load module
- data_juicer.format.load.load_formatter(dataset_path, text_keys=None, suffixes=None, add_suffix=False, **kwargs) -> BaseFormatter
Load the appropriate formatter for different types of data formats.
- Parameters:
dataset_path -- Path to dataset file or dataset directory
text_keys -- key names of the fields that store sample text. Default: None
suffixes -- the suffixes of the files that will be read. Default: None
add_suffix -- whether to add the file suffix to dataset meta. Default: False
- Returns:
a dataset formatter.
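
The idea of choosing a formatter from the files found at dataset_path can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Data-Juicer's implementation: the SUFFIX_TO_FORMATTER table and the helpers list_data_files and pick_formatter_name are hypothetical names introduced here, and the rule "pick the format that matches the most files" is an assumed selection strategy. The sketch returns a formatter name as a string rather than a BaseFormatter instance.

```python
import os
import tempfile
from collections import Counter

# Hypothetical suffix-to-formatter table, for illustration only.
# It is NOT taken from Data-Juicer.
SUFFIX_TO_FORMATTER = {
    ".jsonl": "json",
    ".json": "json",
    ".csv": "csv",
    ".tsv": "tsv",
    ".parquet": "parquet",
    ".txt": "text",
}


def list_data_files(dataset_path, suffixes=None):
    """Return the files at dataset_path whose suffix is allowed.

    dataset_path may be a single file or a directory (walked recursively).
    When suffixes is None, every suffix in the table is allowed.
    """
    if os.path.isfile(dataset_path):
        candidates = [dataset_path]
    else:
        candidates = [
            os.path.join(root, name)
            for root, _, names in os.walk(dataset_path)
            for name in names
        ]
    allowed = set(suffixes) if suffixes else set(SUFFIX_TO_FORMATTER)
    return sorted(p for p in candidates if os.path.splitext(p)[1] in allowed)


def pick_formatter_name(dataset_path, suffixes=None):
    """Return the formatter name that matches the most files (assumed rule)."""
    files = list_data_files(dataset_path, suffixes)
    counts = Counter(
        SUFFIX_TO_FORMATTER[os.path.splitext(p)[1]]
        for p in files
        if os.path.splitext(p)[1] in SUFFIX_TO_FORMATTER
    )
    if not counts:
        raise ValueError(f"no supported data files found in {dataset_path!r}")
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Build a throwaway directory with two JSONL files and one CSV file.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("a.jsonl", "b.jsonl", "c.csv"):
            open(os.path.join(demo_dir, name), "w").close()
        print(pick_formatter_name(demo_dir))
        print(pick_formatter_name(demo_dir, suffixes=[".csv"]))
```

Passing suffixes narrows the set of files considered before a format is chosen, which mirrors the documented role of the suffixes parameter above.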