data_juicer.ops.mapper.remove_bibliography_mapper module
- class data_juicer.ops.mapper.remove_bibliography_mapper.RemoveBibliographyMapper(*args, **kwargs)
Bases: Mapper
Removes bibliography sections at the end of LaTeX documents.
The operator uses a regular expression to find common LaTeX commands that open a bibliography, such as \appendix, \begin{references}, \begin{thebibliography}, and \bibliography, and removes the matched sections from the text. Samples are processed in batch mode for efficiency.
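The regex-based removal described above can be illustrated with a minimal standalone sketch. It is not the library's implementation: the pattern `BIB_PATTERN` and the helper `remove_bibliography` are hypothetical names, and the sketch assumes that everything from the first matching command to the end of the document should be dropped.

```python
import re

# Hypothetical pattern (an assumption, not taken from the library source):
# match the first bibliography-style command and everything after it.
BIB_PATTERN = re.compile(
    r"(\\appendix"
    r"|\\begin\{references\}"
    r"|\\begin\{thebibliography\}"
    r"|\\bibliography\{[^}]*\}"
    r").*$",
    flags=re.DOTALL,  # let "." span newlines so the match runs to the end
)


def remove_bibliography(texts):
    """Strip the trailing bibliography from each LaTeX string in a batch."""
    return [BIB_PATTERN.sub("", text) for text in texts]


batch = [
    "Intro text.\n\\begin{thebibliography}{9}\n\\bibitem{a} A. Author.\n\\end{thebibliography}\n",
    "Body only, no references.",
]
cleaned = remove_bibliography(batch)
print(cleaned)
```

Taking a list of strings rather than a single string mirrors the batch-mode processing mentioned above. A text without any of the listed commands passes through unchanged.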