data_juicer.ops.mapper.whitespace_normalization_mapper module
- class data_juicer.ops.mapper.whitespace_normalization_mapper.WhitespaceNormalizationMapper(*args, **kwargs)
Bases: Mapper
Mapper that normalizes the various kinds of whitespace characters in text samples to the plain space character ' ' (0x20).
A list of whitespace characters can be found here: https://en.wikipedia.org/wiki/Whitespace_character
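The idea of mapping many whitespace variants onto a single space can be illustrated with a minimal, standalone sketch. This is not the mapper's actual implementation: the function name `normalize_whitespace` and the character set `WHITESPACE_CHARS` are hypothetical, and the set is only an assumed subset of the Unicode space separators listed on the page linked above. Leaving tabs and newlines untouched is also an assumption of this sketch.

```python
# Hypothetical subset of Unicode space characters (an assumption for this
# sketch, not the list the library uses).
WHITESPACE_CHARS = {
    "\u00a0",  # no-break space
    "\u1680",  # ogham space mark
    "\u2000", "\u2001", "\u2002", "\u2003", "\u2004", "\u2005",
    "\u2006", "\u2007", "\u2008", "\u2009", "\u200a",  # en quad .. hair space
    "\u202f",  # narrow no-break space
    "\u205f",  # medium mathematical space
    "\u3000",  # ideographic space
}

# Translation table that maps each listed character to a plain space (0x20).
_TABLE = str.maketrans({ch: " " for ch in WHITESPACE_CHARS})


def normalize_whitespace(text: str) -> str:
    """Replace every character in WHITESPACE_CHARS with ' ' (0x20)."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    # The no-break space and the ideographic space both become ' '.
    print(normalize_whitespace("hello\u00a0world\u3000!"))  # prints "hello world !"
```

Using a translation table keeps the replacement to a single pass over the string. Characters outside the assumed set are returned unchanged.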