data_juicer.ops.mapper.sentence_split_mapper module

class data_juicer.ops.mapper.sentence_split_mapper.SentenceSplitMapper(lang: str = 'en', *args, **kwargs)[source]

Bases: Mapper

Mapper that splits the text of each sample into sentences.

__init__(lang: str = 'en', *args, **kwargs)[source]

Initialization method.

Parameters:
  • lang -- language of the text to be split into sentences.

  • args -- extra positional arguments

  • kwargs -- extra keyword arguments

process_batched(samples)[source]
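
The page does not describe the shape of samples or of the return value, so the following is only a rough sketch of what a batched sentence-splitting step might look like. It assumes that a batch is a dict mapping column names to lists, that the text lives under a "text" key, and that the split sentences are joined with newlines. The names split_sentences and process_batched_sketch are hypothetical, and the regular expression is a simplified stand-in for whatever language-specific tokenizer the real mapper selects through lang.

```python
import re

# Simplified stand-in for a real sentence tokenizer (assumption): split on
# whitespace that follows a '.', '!' or '?'.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> str:
    """Return the text with one sentence per line."""
    sentences = [s for s in _SENTENCE_END.split(text.strip()) if s]
    return "\n".join(sentences)


def process_batched_sketch(samples: dict, text_key: str = "text") -> dict:
    """Apply sentence splitting to every text in a column-oriented batch.

    The batch layout (dict of lists) and the "text" key are assumptions
    made for this illustration.
    """
    samples[text_key] = [split_sentences(t) for t in samples[text_key]]
    return samples


batch = {"text": ["Hello world. How are you? Fine!", "One sentence only"]}
result = process_batched_sketch(batch)
for text in result["text"]:
    print(repr(text))
```

A rule this simple will mis-split abbreviations such as "e.g." or "Dr.", which is one reason a language-aware tokenizer, chosen by a parameter like lang, is generally preferable.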