data-juicer

docs

  • Operator Schemas
  • Data Recipe Gallery
  • Dataset Configuration Guide
  • “Bad” Data Exhibition
  • Motivation
  • Roadmap
  • Start the Service
  • API Calls
  • Demonstration
  • How-to Guide for Developers
  • Distributed Data Processing in Data-Juicer
  • User Guide
  • Developer Guide
  • User Guide (Chinese)
  • Developer Guide (Chinese)
  • Awesome Data-Model Co-Development of MLLMs
  • News
  • Contribution to This Survey
  • References
  • “Section - Mentioned Papers” Retrieval List

demos

  • Demos

tools

  • Distributed Fuzzy Deduplication Tools
  • Auto Evaluation Toolkit
  • GPT EVAL: Evaluate your model with OpenAI API
  • Evaluation Results Recorder
  • Format Conversion Tools
  • Multimodal Tools
  • Post Tuning Tools
  • Hyper-parameter Optimization for Data Recipe
  • Label Studio Service Utility
  • Metrics for video generation
  • Postprocess tools
  • Preprocess Tools
  • Data Scoring Capabilities
  • Quality Classifier Toolkit (GPT-3 Reproduced)

thirdparty

  • LLM Ecosystems
  • Third-party Model Library

API Reference

  • API Reference
    • data_juicer.core package
      • data_juicer.core.data package
      • data_juicer.core.executor package
    • data_juicer.ops package
      • data_juicer.ops.aggregator package
      • data_juicer.ops.common package
      • data_juicer.ops.deduplicator package
      • data_juicer.ops.filter package
      • data_juicer.ops.grouper package
      • data_juicer.ops.mapper package
        • data_juicer.ops.mapper.annotation package
      • data_juicer.ops.selector package
    • data_juicer.analysis package
    • data_juicer.config package
    • data_juicer.format package
en|v1.3.3
Copyright © 2024, Data-Juicer Team