trinity.common.rewards.format_reward module
Base Reward Function Class.
- class trinity.common.rewards.format_reward.FormatReward(pattern: str | None = None)
Bases: RewardFn
A reward function that checks whether the reasoning process is enclosed in <think> and </think> tags and the final answer is enclosed in <answer> and </answer> tags. Ref: huggingface/open-r1
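The check described above can be illustrated with a small standalone sketch. It does not use the Trinity API: `format_score` and `DEFAULT_PATTERN` are hypothetical names, the default pattern is an assumption modeled on the tag layout in the description, and the 1.0/0.0 scores are illustrative. The actual default pattern and reward values of FormatReward may differ.

```python
import re
from typing import Optional

# Assumed default: a <think>...</think> block followed by an
# <answer>...</answer> block, with optional whitespace around them.
# This is an illustration, not the library's actual pattern.
DEFAULT_PATTERN = r"^\s*<think>.*?</think>\s*<answer>.*?</answer>\s*$"


def format_score(response: str, pattern: Optional[str] = None) -> float:
    """Return 1.0 if the response matches the expected tag layout, else 0.0."""
    # re.DOTALL lets "." match newlines, so multi-line reasoning is accepted.
    regex = re.compile(pattern or DEFAULT_PATTERN, re.DOTALL)
    return 1.0 if regex.match(response) else 0.0


if __name__ == "__main__":
    good = "<think>2 + 2 = 4</think>\n<answer>4</answer>"
    bad = "The answer is 4."
    print(format_score(good), format_score(bad))
```

Passing a custom `pattern` here mirrors the optional `pattern` argument in the class signature, on the assumption that it overrides the default regular expression.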