data_juicer.ops.mapper.clean_email_mapper module
- class data_juicer.ops.mapper.clean_email_mapper.CleanEmailMapper(pattern: str | None = None, repl: str = '', *args, **kwargs)
Bases: Mapper
Cleans email addresses from text samples using a regular expression.
This operator removes or replaces email addresses in the text using a regular expression. When pattern is None (the default), a built-in email pattern is used; otherwise the supplied custom pattern is used. Each match is replaced with repl, which defaults to an empty string, so matched addresses are removed unless a replacement is given. The operation is applied to every text sample in the batch, and a sample that contains no email address is left unchanged.
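The behavior described above can be sketched with the standard `re` module alone. This is a minimal illustration, not the operator's implementation: the helper `clean_emails` and the constant `ASSUMED_EMAIL_PATTERN` are hypothetical names, and the pattern is a simple stand-in that may differ from the operator's actual default.

```python
import re

# Assumption: an illustrative email pattern, not the operator's real default.
ASSUMED_EMAIL_PATTERN = r"[A-Za-z0-9._+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]+"


def clean_emails(texts, pattern=None, repl=""):
    """Hypothetical helper: replace every pattern match in each text with repl."""
    # Fall back to the assumed pattern when no custom pattern is given.
    compiled = re.compile(pattern or ASSUMED_EMAIL_PATTERN)
    # Apply the substitution to every sample in the batch.
    return [compiled.sub(repl, text) for text in texts]


batch = [
    "Contact alice@example.com for details.",
    "No address here.",
]

# Default replacement: matched addresses are removed.
cleaned = clean_emails(batch)

# Custom replacement: matched addresses are masked with a placeholder.
masked = clean_emails(batch, repl="<EMAIL>")
```

With the default empty replacement, the whitespace around a removed address is left in place, so a double space can remain where the address was. Passing a placeholder such as `"<EMAIL>"` for `repl` keeps the sentence structure readable.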