trinity.common.rewards.format_reward module
Format reward function class.
- class trinity.common.rewards.format_reward.FormatReward(pattern: str | None = None)
Bases: RewardFn
A reward function that checks whether the reasoning process is enclosed in <think> and </think> tags and the final answer is enclosed in <answer> and </answer> tags. Ref: https://github.com/huggingface/open-r1/blob/main/src/open_r1/rewards.py
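
The check described above can be illustrated with a small standalone sketch. It does not use the Trinity API: the helper name `matches_format` and the constant `ASSUMED_DEFAULT_PATTERN` are hypothetical, and the regular expression is an assumption modeled on the referenced open-r1 reward, not a confirmed default of `FormatReward`. The `pattern` argument mirrors the optional `pattern` parameter in the signature above.

```python
import re
from typing import Optional

# Assumption: a default pattern in the style of the open-r1 reference.
# The actual default used by FormatReward may differ.
ASSUMED_DEFAULT_PATTERN = r"^<think>\n.*?\n</think>\n<answer>\n.*?\n</answer>$"


def matches_format(response: str, pattern: Optional[str] = None) -> bool:
    """Return True if the whole response matches the tag layout.

    Hypothetical helper for illustration only; it is not part of Trinity.
    """
    # re.DOTALL lets "." match newlines, so multi-line reasoning is accepted.
    regex = re.compile(pattern or ASSUMED_DEFAULT_PATTERN, re.DOTALL)
    return regex.fullmatch(response) is not None


well_formed = "<think>\nstep by step\n</think>\n<answer>\n42\n</answer>"
missing_think = "<answer>\n42\n</answer>"

print(matches_format(well_formed))
print(matches_format(missing_think))
```

A reward function built on such a check would map the boolean to a score. The score values are not stated here, so they are left out of the sketch.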