data_juicer.ops.mapper.remove_header_mapper module

class data_juicer.ops.mapper.remove_header_mapper.RemoveHeaderMapper(drop_no_head: bool = True, *args, **kwargs)[source]

Bases: Mapper

Mapper to remove headers at the beginning of documents in LaTeX samples.

__init__(drop_no_head: bool = True, *args, **kwargs)[source]

Initialization method.

Parameters:
  • drop_no_head – whether to drop sample texts without headers.

  • args – extra positional args

  • kwargs – extra keyword args
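The effect of drop_no_head can be illustrated with a minimal standalone sketch. It does not call Data-Juicer. The function name strip_latex_header and the pattern are hypothetical, and the sketch assumes that "header" means everything before the first sectioning command (\part, \chapter, \section and so on). The set of commands the operator actually recognises may differ.

```python
import re

# Hypothetical pattern (an assumption, not the operator's own): matches the
# start of a LaTeX sectioning command such as \section{...}, \chapter*{...}
# or \section[short]{...}.
_SECTION_RE = re.compile(
    r"\\(?:part|chapter|section|subsection|subsubsection|paragraph|subparagraph)"
    r"\*?(?:\[[^\]]*\])?\{"
)


def strip_latex_header(text: str, drop_no_head: bool = True) -> str:
    """Remove everything that precedes the first sectioning command.

    If no sectioning command is found, return an empty string when
    drop_no_head is True, otherwise return the text unchanged.
    """
    match = _SECTION_RE.search(text)
    if match is None:
        return "" if drop_no_head else text
    # Keep the sectioning command itself and everything after it.
    return text[match.start():]


doc = "\\documentclass{article}\n\\begin{document}\n\\section{Intro}\nHello"
print(strip_latex_header(doc))
print(repr(strip_latex_header("no headings here")))
print(repr(strip_latex_header("no headings here", drop_no_head=False)))
```

In this sketch, drop_no_head=True turns a text without any sectioning command into an empty string, while drop_no_head=False leaves it untouched.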

process_batched(samples)[source]
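A batched method of this kind typically receives a column-oriented batch rather than a single sample. The sketch below is standalone and hedged: it assumes samples is a dict mapping column names to equal-length lists, with the text stored under a "text" key, and it treats only \section as the end of the header. The name process_batched_sketch and its text_key parameter are hypothetical and are not part of the documented API.

```python
import re

# Assumption for this sketch: only \section (optionally starred) marks the
# end of the header.
_FIRST_SECTION = re.compile(r"\\section\*?\{")


def process_batched_sketch(samples, text_key="text", drop_no_head=True):
    """Strip the header from every text in a column-oriented batch."""
    cleaned = []
    for text in samples[text_key]:
        match = _FIRST_SECTION.search(text)
        if match:
            # Keep the text from the first \section onwards.
            cleaned.append(text[match.start():])
        else:
            # No header found: empty the text or keep it as-is.
            cleaned.append("" if drop_no_head else text)
    # Return a new batch; other columns are passed through unchanged.
    return {**samples, text_key: cleaned}


batch = {"text": ["\\title{T}\n\\section{A}\nbody", "plain"], "meta": [1, 2]}
print(process_batched_sketch(batch))
```

The sketch keeps the batch length constant and empties texts instead of removing rows, so the other columns stay aligned with the text column.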