Welcome to Trinity-RFT’s documentation!

💡 What is Trinity-RFT?

Trinity-RFT is a flexible, general-purpose framework for reinforcement fine-tuning (RFT) of large language models (LLMs). It decouples the RFT process into three key components: Explorer, Trainer, and Buffer.
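At a high level, the Explorer generates rollouts, the Buffer stores and serves the resulting experience data, and the Trainer consumes that data to update the model. The loop below is a minimal sketch of this decoupling; every name in it (Buffer, Explorer, Trainer, explore, train_step) is an assumption made for illustration, not Trinity-RFT’s actual API.

# Hypothetical sketch of the Explorer / Trainer / Buffer decoupling.
from collections import deque

class Buffer:
    """Queues experience data between rollout and training."""
    def __init__(self):
        self._queue = deque()

    def put(self, experiences):
        self._queue.extend(experiences)

    def get(self, batch_size):
        n = min(batch_size, len(self._queue))
        return [self._queue.popleft() for _ in range(n)]

class Explorer:
    """Runs rollouts and writes experiences to the buffer."""
    def explore(self, tasks):
        return [{"prompt": t, "response": "<model output>", "reward": 0.0} for t in tasks]

class Trainer:
    """Consumes experience batches and updates model weights."""
    def train_step(self, batch):
        pass  # compute a loss on `batch` and apply an optimizer step

buffer, explorer, trainer = Buffer(), Explorer(), Trainer()
for step in range(10):
    buffer.put(explorer.explore(tasks=["task-A", "task-B"]))
    trainer.train_step(buffer.get(batch_size=2))

Because the Explorer and Trainer interact only through the Buffer, they can run in a single process (as above) or as separate services scaled independently across devices. On top of this design, Trinity-RFT serves users with different backgrounds and objectives: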

  • 🤖 For agent application developers. [tutorial]

    • Train agent applications directly, including those developed with agent frameworks like AgentScope.

  • 🧠 For RL algorithm researchers. [tutorial]

    • Design and validate new reinforcement learning algorithms using compact, plug-and-play modules.

    • Example: Mixture of SFT and GRPO (a simplified loss mixture is sketched after this list)

  • 📊 For data engineers. [tutorial]

    • Create task-specific datasets and build data pipelines for cleaning, augmentation, and human-in-the-loop scenarios.

    • Example: Data Processing
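
The SFT/GRPO mixture above can be read as interpolating a supervised loss with a group-relative policy-gradient loss. The sketch below is a simplified illustration only: it omits GRPO’s importance ratio, clipping, and KL terms, and the function names and sft_weight parameter are hypothetical, not Trinity-RFT modules.

# Hypothetical sketch of mixing an SFT loss with a simplified GRPO-style loss.
import torch

def grpo_advantages(rewards):
    # GRPO normalizes rewards within a group of rollouts for the same prompt.
    return (rewards - rewards.mean()) / (rewards.std() + 1e-6)

def mixed_loss(sft_logprobs, rollout_logprobs, rewards, sft_weight=0.5):
    sft_loss = -sft_logprobs.mean()  # supervised term on curated data
    pg_loss = -(grpo_advantages(rewards) * rollout_logprobs).mean()  # RL term on rollouts
    return sft_weight * sft_loss + (1.0 - sft_weight) * pg_loss

# Toy usage with fake log-probabilities and rewards:
loss = mixed_loss(
    sft_logprobs=torch.tensor([-0.2, -0.4]),
    rollout_logprobs=torch.tensor([-1.0, -0.8, -1.2, -0.9]),
    rewards=torch.tensor([1.0, 0.0, 1.0, 0.0]),
)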

🌟 Key Features

  • Flexible RFT Modes:

    • Supports synchronous/asynchronous, on-policy/off-policy, and online/offline training. Rollout and training can run separately and scale independently across devices.

    [Figure: RFT modes supported by Trinity-RFT]
  • General Agentic-RL Support:

    • Supports both concatenated and general multi-turn agentic workflows, and can directly train agent applications developed with agent frameworks like AgentScope (a toy workflow is sketched after this list).

    [Figure: Agentic workflows]
  • Full Lifecycle Data Pipelines:

    • Enables pipeline processing of rollout and experience data, supporting active management (prioritization, cleaning, augmentation) throughout the RFT lifecycle (a toy pipeline stage is sketched after this list).

    [Figure: Data pipeline design]
  • User-Friendly Design:

    • Modular, decoupled architecture for easy adoption and development. Rich graphical user interfaces enable low-code usage.

    [Figure: System architecture]
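
To make the agentic-RL feature concrete, the toy workflow below rolls an agent through a multi-turn environment and emits experience records. The ToyEnv/toy_agent interface is a hypothetical stand-in, not Trinity-RFT’s or AgentScope’s API.

# Hypothetical multi-turn workflow that produces experience data.
class ToyEnv:
    def __init__(self):
        self.turns = 0

    def reset(self):
        self.turns = 0
        return "initial observation"

    def step(self, action):
        self.turns += 1
        return f"obs-{self.turns}", 1.0, self.turns >= 2  # observation, reward, done

def toy_agent(observation):
    return f"action given {observation}"  # stands in for one LLM call per turn

def multi_turn_workflow(agent, env, max_turns=4):
    experiences, observation = [], env.reset()
    for _ in range(max_turns):
        action = agent(observation)
        observation, reward, done = env.step(action)
        experiences.append({"obs": observation, "action": action, "reward": reward})
        if done:
            break
    return experiences  # handed to the buffer for training

print(multi_turn_workflow(toy_agent, ToyEnv()))

Likewise, a stage in the full-lifecycle data pipeline might clean and prioritize experiences before training. The operator names below are illustrative assumptions, not Trinity-RFT’s actual operators.

# Hypothetical pipeline stage: drop malformed rows, then order by reward.
def clean(experiences):
    return [e for e in experiences if e["reward"] is not None]

def prioritize(experiences):
    return sorted(experiences, key=lambda e: e["reward"], reverse=True)

def pipeline(experiences):
    return prioritize(clean(experiences))

print(pipeline([{"reward": 0.3}, {"reward": None}, {"reward": 0.9}]))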

Acknowledgements

This project is built upon many excellent open-source projects.

Citation

@misc{trinity-rft,
      title={Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models},
      author={Xuchen Pan and Yanxi Chen and Yushuo Chen and Yuchang Sun and Daoyuan Chen and Wenhao Zhang and Yuexiang Xie and Yilun Huang and Yilei Zhang and Dawei Gao and Yaliang Li and Bolin Ding and Jingren Zhou},
      year={2025},
      eprint={2505.17826},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.17826},
}