Data Juicer
Version: v1.4.1 (English)

data_juicer.ops.mapper.clean_copyright_mapper module

class data_juicer.ops.mapper.clean_copyright_mapper.CleanCopyrightMapper(*args, **kwargs)

Bases: Mapper

Mapper to clean copyright comments at the beginning of the text samples.

__init__(*args, **kwargs)

Initialization method.

Parameters:
  • args – extra positional arguments

  • kwargs – extra keyword arguments

process_batched(samples)
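
This page gives no details on how process_batched handles a batch, so the following is only a minimal, self-contained sketch of what "cleaning copyright comments at the beginning of the text samples" could look like. It is not the Data-Juicer implementation. The helper name strip_leading_copyright, the "text" key, the column-oriented batch layout (a dict of lists), and the set of comment prefixes are all assumptions made for illustration.

```python
import re

# Assumption: a leading C-style block comment (/* ... */) may hold the notice.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
# Assumption: these line-comment prefixes mark header comments.
LINE_PREFIXES = ("//", "#", "--")


def strip_leading_copyright(text: str) -> str:
    """Remove a copyright comment at the start of `text`, if one is present."""
    stripped = text.lstrip()

    # Case 1: the text opens with a block comment that mentions "copyright".
    match = BLOCK_COMMENT.match(stripped)
    if match and "copyright" in match.group(0).lower():
        return stripped[match.end():].lstrip()

    # Case 2: the text opens with a run of line comments mentioning "copyright".
    lines = text.split("\n")
    end = 0
    while end < len(lines) and lines[end].lstrip().startswith(LINE_PREFIXES):
        end += 1
    header = "\n".join(lines[:end])
    if end and "copyright" in header.lower():
        return "\n".join(lines[end:]).lstrip()

    # No leading copyright comment found: return the text unchanged.
    return text


def process_batched(samples: dict, text_key: str = "text") -> dict:
    """Hypothetical batched wrapper: clean every text in a dict-of-lists batch."""
    samples[text_key] = [strip_leading_copyright(t) for t in samples[text_key]]
    return samples


batch = {
    "text": [
        "/* Copyright 2024 Foo */\nint main() {}",
        "# Copyright Foo\n# License\nimport os",
        "# just a note\nx = 1",
    ]
}
cleaned = process_batched(batch)
```

In this sketch only comments that mention "copyright" are removed, so an ordinary leading comment such as "# just a note" is left in place. Whether the real operator applies the same rule should be checked against its source.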
Copyright © 2024, Data-Juicer Team