Welcome to the RM-Gallery tutorial series! This directory contains comprehensive guides to help you master reward models.
🗺️ Learning Paths
🌱 Beginner Path
Goal: Get started with reward models in 30 minutes
- Quickstart Guide - Install, use, and evaluate your first RM (5 min)
- Building RM Overview - Understand RM types and architecture (10 min)
- Using Built-in RMs - Explore 35+ pre-built models (15 min)
🚀 Intermediate Path
Goal: Build and customize reward models
- Building Custom RMs - Create rule-based and LLM-based rewards (30 min)
- Data Pipeline - Load, process, and transform data (20 min)
- End-to-End Tutorial - Complete workflow from data to deployment (30 min)
🎓 Advanced Path
Goal: Train, evaluate, and deploy at scale
- Training RM Overview - Understand training paradigms and setup (15 min)
- Training with VERL - Complete RL-based training workflow (60 min)
- High-Performance Serving - Deploy an RM as a production service (45 min)
📚 Tutorial Catalog
Building Reward Models
| Tutorial | Level | Time | Description |
|---|---|---|---|
| Overview | Beginner | 10 min | Introduction to building RMs |
| Ready-to-Use RMs | Beginner | 15 min | Using pre-built models |
| Custom Rewards | Intermediate | 30 min | Building custom RMs |
| Auto Rubric | Advanced | 45 min | Automatic rubric generation |
Training Reward Models
| Tutorial | Level | Time | Description |
|---|---|---|---|
| Training Overview | Intermediate | 15 min | Introduction to training |
| Bradley-Terry RM | Advanced | 60 min | Training Bradley-Terry models |
| SFT RM | Advanced | 45 min | Training with SFT |
| RL Training | Advanced | 90 min | Full RL-based training |
Evaluating Reward Models
| Tutorial | Level | Time | Description |
|---|---|---|---|
| Evaluation Overview | Beginner | 10 min | Introduction to evaluation |
| RMB | Intermediate | 30 min | Reward Model Benchmark |
| RM-Bench | Intermediate | 30 min | Subtlety and style evaluation |
| JudgeBench | Intermediate | 30 min | Judge capability testing |
| RewardBench2 | Intermediate | 30 min | Latest benchmark |
| Conflict Detector | Advanced | 45 min | Detect evaluation conflicts |
Data Processing
| Tutorial | Level | Time | Description |
|---|---|---|---|
| Data Pipeline | Beginner | 20 min | Complete data workflow |
| Data Annotation | Intermediate | 30 min | Annotating training data |
| Data Loading | Beginner | 15 min | Loading from various sources |
| Data Processing | Intermediate | 25 min | Transforming data |
Applications
| Tutorial | Level | Time | Description |
|---|---|---|---|
| RM Server | Advanced | 45 min | Deploy an RM as a service |
| Best-of-N | Intermediate | 20 min | Select best response |
| Data Refinement | Intermediate | 30 min | Improve data quality |
| Post Training | Advanced | 60 min | RLHF integration |
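For readers new to the Best-of-N entry in the table above, here is a minimal sketch of the general idea: score several candidate responses with a reward function and keep the highest-scoring one. This is not RM-Gallery's API. The names `best_of_n` and `toy_length_reward` are hypothetical, and the toy scorer is a stand-in for a real reward model.

```python
from typing import Callable, Sequence


def best_of_n(
    prompt: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest reward for the given prompt."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    # max() keeps the first candidate in case of a tie.
    return max(candidates, key=lambda response: score(prompt, response))


def toy_length_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in scorer: prefers responses close to 8 words."""
    return -abs(len(response.split()) - 8)


if __name__ == "__main__":
    prompt = "What is the capital of France?"
    candidates = [
        "Yes.",
        "Paris is the capital of France and its largest city.",
        "word " * 40,
    ]
    print(best_of_n(prompt, candidates, toy_length_reward))
```

In practice the `score` callable would wrap whichever reward model you choose from the tutorials; the selection logic stays the same.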
🎯 By Use Case
I want to...
- Evaluate AI responses → Start with Quickstart → Then Using Built-in RMs
- Build a custom reward model → Read Building Custom RMs → Try End-to-End Tutorial
- Train my own reward model → Start with Training Overview → Then RL Training
- Test on benchmarks → Read Evaluation Overview → Try specific benchmarks: RMB, RM-Bench, RewardBench2
- Deploy to production → Follow RM Server Guide → Implement Best-of-N
- Process custom data → Read Data Pipeline → Use Data Loading
💡 Tutorial Tips
Before You Start
- ✅ Install RM-Gallery: `pip install rm-gallery`
- ✅ Set up Python environment (>= 3.10, < 3.13)
- ✅ (Optional) Get API credentials for LLM-based models
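To confirm that your interpreter falls inside the version range listed above, a small check like the following may help. It is a sketch using only the standard library; the helper name `python_version_supported` is hypothetical and not part of RM-Gallery.

```python
import sys


def python_version_supported(version_info=sys.version_info) -> bool:
    """Return True when the version is >= 3.10 and < 3.13."""
    # Compare only (major, minor); patch releases do not affect the range.
    return (3, 10) <= tuple(version_info[:2]) < (3, 13)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_version_supported():
        print(f"Python {current} is within the supported range")
    else:
        print(f"Python {current} is outside the supported range (>= 3.10, < 3.13)")
```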
While Learning
- 📖 Read in order: Tutorials build on each other
- 💻 Run the code: Try examples in your environment
- 🔄 Experiment: Modify code and see what happens
- ❓ Ask questions: Use GitHub Discussions
After Completing
- 🎯 Apply to your project: Use what you learned
- 🤝 Share feedback: Help us improve tutorials
- 📝 Contribute: Add your own examples
🔗 Quick Links
Essential
- Quickstart Guide - Get started in 5 minutes
- FAQ - Common questions answered
Interactive
- End-to-End Tutorial - Complete project
Reference
- RM Library - All available models
- Rubric Library - Evaluation rubrics
- Contribution Guide - How to contribute
🆘 Getting Help
Stuck on a tutorial?
- Check the FAQ first
- Search GitHub Issues
- Ask in GitHub Discussions
- Join our community channels
Found an error?
Please open a GitHub Issue with the tutorial name and problem description.
🚀 Next Steps
After completing the tutorials:
- Build your first project using RM-Gallery
- Share your experience with the community
- Contribute back with examples or improvements
- Stay updated on new features and models
Ready to start? Go to the Quickstart Guide 🎉
Have questions? Check the FAQ or ask in Discussions 💬
Want to contribute? Read our Contribution Guide 🤝