data_juicer.ops.mapper.remove_comments_mapper module

class data_juicer.ops.mapper.remove_comments_mapper.RemoveCommentsMapper(doc_type: str | List[str] = 'tex', inline: bool = True, multiline: bool = True, *args, **kwargs)[source]

Bases: Mapper

Mapper to remove comments in different kinds of documents.

Only 'tex' is supported for now.

__init__(doc_type: str | List[str] = 'tex', inline: bool = True, multiline: bool = True, *args, **kwargs)[source]

Initialization method.

Parameters:
  • doc_type -- Type(s) of document from which to remove comments.

  • inline -- Whether to remove inline comments.

  • multiline -- Whether to remove multiline comments.

  • args -- extra positional arguments

  • kwargs -- extra keyword arguments
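
The effect of the inline and multiline flags on TeX text can be illustrated with a minimal standalone sketch. It is not the library's implementation: the function name remove_tex_comments and both regular expressions are assumptions chosen for illustration. Here "inline" is taken to mean a comment that follows other content on a line, and "multiline" to mean lines that consist entirely of a comment.

```python
import re


def remove_tex_comments(text: str, inline: bool = True, multiline: bool = True) -> str:
    """Hypothetical helper: strip TeX comments from a string.

    This is an illustrative sketch, not RemoveCommentsMapper itself.
    """
    if inline:
        # Remove a '%' and the rest of its line, but only when the '%'
        # follows some other character that is not a backslash
        # (so an escaped '\%' is kept) and is not at the start of a line.
        text = re.sub(r'(?<=[^\\\n])%.*$', '', text, flags=re.MULTILINE)
    if multiline:
        # Remove lines that begin with '%', including their newline.
        text = re.sub(r'^%.*\n?', '', text, flags=re.MULTILINE)
    return text


# Assumed sample input: one inline comment, one full-line comment,
# and one escaped percent sign that must survive.
sample = "a = 1 % note\n% full line\n50\\% done\n"
cleaned = remove_tex_comments(sample)
print(repr(cleaned))
```

Passing inline=False or multiline=False in this sketch leaves the corresponding kind of comment untouched, which mirrors how the two constructor flags are described above.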

process_batched(samples)[source]