data_juicer.ops.mapper.remove_long_words_mapper module
- class data_juicer.ops.mapper.remove_long_words_mapper.RemoveLongWordsMapper(min_len: int = 1, max_len: int = 9223372036854775807, *args, **kwargs)[source]
Bases:
Mapper
Mapper to remove words whose length falls outside a specified range.
- __init__(min_len: int = 1, max_len: int = 9223372036854775807, *args, **kwargs)[source]
Initialization method.
- Parameters:
min_len -- The minimum word length for this op. Words shorter than this value are removed.
max_len -- The maximum word length for this op. Words longer than this value are removed.
args -- extra positional args
kwargs -- extra keyword args
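
The effect of min_len and max_len described above can be illustrated with a minimal standalone sketch. This is not the library's implementation: the helper name remove_out_of_range_words is hypothetical, and splitting words on whitespace is an assumption made for illustration only (the actual op may tokenize or normalize words differently). The default upper bound mirrors the documented value 9223372036854775807, which equals 2**63 - 1.

```python
# Hypothetical sketch of the documented behaviour, not data_juicer code.
DEFAULT_MAX_LEN = 2**63 - 1  # same value as the documented default for max_len


def remove_out_of_range_words(
    text: str, min_len: int = 1, max_len: int = DEFAULT_MAX_LEN
) -> str:
    """Keep only words whose length satisfies min_len <= len(word) <= max_len.

    Assumption: words are separated by whitespace; line breaks are preserved.
    """
    kept_lines = []
    for line in text.split("\n"):
        # Drop words that are too short or too long for the configured range.
        kept = [w for w in line.split() if min_len <= len(w) <= max_len]
        kept_lines.append(" ".join(kept))
    return "\n".join(kept_lines)


sample = "a tiny supercalifragilisticexpialidocious word"
result = remove_out_of_range_words(sample, min_len=2, max_len=10)
print(result)
```

With min_len=2 and max_len=10, the one-character word and the very long word are dropped while the remaining words are kept. With the default arguments, every non-empty word passes the length check.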