data_juicer.ops.filter.suffix_filter module
- class data_juicer.ops.filter.suffix_filter.SuffixFilter(suffixes: str | List[str] = [], *args, **kwargs)[source]
Bases:
Filter
Filter that keeps samples with the specified suffix.
- __init__(suffixes: str | List[str] = [], *args, **kwargs)[source]
Initialization method.
- Parameters:
suffixes -- the suffix or suffixes of the samples that will be kept. For example: '.txt', 'txt' or ['txt', '.pdf', 'docx']
args -- extra positional arguments
kwargs -- extra keyword arguments
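
The `suffixes` parameter accepts either a single string or a list of strings. The following standalone sketch illustrates one way such a parameter could be handled; it does not import data_juicer and is not the library's implementation. The helper names `normalize_suffixes` and `keep_sample` are hypothetical, and two behaviours are assumptions made for illustration only: that a leading dot is optional when comparing suffixes, and that an empty list disables filtering.

```python
from typing import List, Optional, Union


def normalize_suffixes(suffixes: Union[str, List[str], None]) -> List[str]:
    """Return the suffixes as a list of strings that each start with one dot.

    Hypothetical helper: a single string is wrapped in a list, and a
    leading dot is added when missing (assumption, not confirmed
    library behaviour).
    """
    if suffixes is None:
        return []
    if isinstance(suffixes, str):
        suffixes = [suffixes]
    # Strip any leading dots, skip empty entries, then add exactly one dot.
    return ["." + s.lstrip(".") for s in suffixes if s.lstrip(".")]


def keep_sample(sample_suffix: str,
                suffixes: Optional[Union[str, List[str]]] = None) -> bool:
    """Decide whether a sample with the given suffix should be kept."""
    wanted = normalize_suffixes(suffixes)
    if not wanted:
        # Assumption: no suffixes specified means nothing is filtered out.
        return True
    return ("." + sample_suffix.lstrip(".")) in wanted


if __name__ == "__main__":
    print(keep_sample(".txt", "txt"))
    print(keep_sample(".md", ["txt", ".pdf", "docx"]))
```

Check the suffix format stored in your own samples before relying on dot-insensitive matching, since the real operator may compare suffixes exactly as given.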