Overview

Welcome to the RM-Gallery tutorial series! This directory contains guides for building, training, evaluating, and deploying reward models (RMs).

🗺️ Learning Paths

🌱 Beginner Path

Goal: Get started with reward models in 30 minutes

Quickstart Guide - Install, use, and evaluate your first RM (5 min)
Building RM Overview - Understand RM types and architecture (10 min)
Using Built-in RMs - Explore 35+ pre-built models (15 min)

🚀 Intermediate Path

Goal: Build and customize reward models

Building Custom RMs - Create rule-based and LLM-based rewards (30 min)
Data Pipeline - Load, process, and transform data (20 min)
End-to-End Tutorial - Complete workflow from data to deployment (30 min)

🎓 Advanced Path

Goal: Train, evaluate, and deploy at scale

Training RM Overview - Understand training paradigms and setup (15 min)
Training with VERL - Complete RL-based training workflow (60 min)
High-Performance Serving - Deploy RM as production service (45 min)

📚 Tutorial Catalog

Building Reward Models

Tutorial	Level	Time	Description
Overview	Beginner	10 min	Introduction to building RMs
Ready-to-Use RMs	Beginner	15 min	Using pre-built models
Custom Rewards	Intermediate	30 min	Building custom RMs
Auto Rubric	Advanced	45 min	Automatic rubric generation

Training Reward Models

Tutorial	Level	Time	Description
Training Overview	Intermediate	15 min	Introduction to training
Bradley-Terry RM	Advanced	60 min	Training Bradley-Terry models
SFT RM	Advanced	45 min	Training with SFT
RL Training	Advanced	90 min	Full RL-based training

Evaluating Reward Models

Tutorial	Level	Time	Description
Evaluation Overview	Beginner	10 min	Introduction to evaluation
RMB	Intermediate	30 min	Reward Model Benchmark
RM-Bench	Intermediate	30 min	Subtlety and style evaluation
JudgeBench	Intermediate	30 min	Judge capability testing
RewardBench2	Intermediate	30 min	Latest benchmark
Conflict Detector	Advanced	45 min	Detect evaluation conflicts

Data Processing

Tutorial	Level	Time	Description
Data Pipeline	Beginner	20 min	Complete data workflow
Data Annotation	Intermediate	30 min	Annotating training data
Data Loading	Beginner	15 min	Loading from various sources
Data Processing	Intermediate	25 min	Transforming data

Applications

Tutorial	Level	Time	Description
RM Server	Advanced	45 min	Deploy RM as service
Best-of-N	Intermediate	20 min	Select best response
Data Refinement	Intermediate	30 min	Improve data quality
Post Training	Advanced	60 min	RLHF integration
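
For readers new to the Best-of-N entry above, the idea can be sketched in a few lines: score every candidate response with a reward function and keep the highest-scoring one. This is a minimal illustration only. The names `best_of_n` and `toy_overlap_reward` are hypothetical and are not part of the RM-Gallery API; in practice the scoring callable would wrap a real reward model.

```python
from typing import Callable, Sequence


def best_of_n(
    prompt: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest reward for the given prompt.

    `score` is any callable mapping (prompt, response) to a number.
    On ties, the earliest candidate wins (behavior of built-in max).
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=lambda response: score(prompt, response))


def toy_overlap_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in reward: number of words shared with the prompt."""
    prompt_words = set(prompt.lower().split())
    response_words = set(response.lower().split())
    return float(len(prompt_words & response_words))


if __name__ == "__main__":
    answers = [
        "I do not know",
        "A reward model scores responses",
        "Hello",
    ]
    print(best_of_n("what is a reward model", answers, toy_overlap_reward))
```

The toy reward exists only to make the sketch runnable; the selection step stays the same whichever scorer is plugged in.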

🎯 By Use Case

I want to...

Evaluate AI responses → Start with Quickstart → Then Using Built-in RMs

Build a custom reward model → Read Building Custom RMs → Try End-to-End Tutorial

Train my own reward model → Start with Training Overview → Then RL Training

Test on benchmarks → Read Evaluation Overview → Try specific benchmarks: RMB, RM-Bench, RewardBench2

Deploy to production → Follow RM Server Guide → Implement Best-of-N

Process custom data → Read Data Pipeline → Use Data Loading

💡 Tutorial Tips

Before You Start

✅ Set up a Python environment (>= 3.10, < 3.13)
✅ Install RM-Gallery: pip install rm-gallery
✅ (Optional) Get API credentials for LLM-based models
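
To confirm that your interpreter falls inside the version range listed in the checklist, a small check like the following may help. It is a sketch that uses only the standard library; the function name `python_version_supported` is made up for this example and is not something RM-Gallery provides.

```python
import sys

# Supported range taken from the checklist: >= 3.10 and < 3.13.
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (3, 13)


def python_version_supported(version=None):
    """Return True if the (major, minor) version is inside the supported range.

    `version` defaults to the running interpreter; pass a tuple such as
    (3, 11) to check a different version.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    if python_version_supported():
        print(f"Python {current} is in the supported range.")
    else:
        print(f"Python {current} is outside the supported range (>= 3.10, < 3.13).")
```

Run it with the interpreter you plan to install RM-Gallery into, so that the check reflects the environment you will actually use.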

While Learning

📖 Read in order: Tutorials build on each other
💻 Run the code: Try examples in your environment
🔄 Experiment: Modify code and see what happens
❓ Ask questions: Use GitHub Discussions

After Completing

🎯 Apply to your project: Use what you learned
🤝 Share feedback: Help us improve tutorials
📝 Contribute: Add your own examples

🔗 Quick Links

Essential

Quickstart Guide - Get started in 5 minutes
FAQ - Common questions answered

Interactive

End-to-End Tutorial - Complete project

Reference

RM Library - All available models
Rubric Library - Evaluation rubrics
Contribution Guide - How to contribute

🆘 Getting Help

Stuck on a tutorial?

Check the FAQ first
Search GitHub Issues
Ask in GitHub Discussions
Join our community channels

Found an error?

Please open a GitHub Issue with the tutorial name and problem description.

🚀 Next Steps

After completing the tutorials:

Build your first project using RM-Gallery
Share your experience with the community
Contribute back with examples or improvements
Stay updated on new features and models

Ready to start? Go to the Quickstart Guide 🎉

Have questions? Check the FAQ or ask in Discussions 💬

Want to contribute? Read our Contribution Guide 🤝