trinity.common.rewards.format_reward module

Base Reward Function Class.

class trinity.common.rewards.format_reward.FormatReward(pattern: str | None = None)

Bases: RewardFn

A reward function that checks whether the reasoning process is enclosed within <think> and </think> tags and the final answer is enclosed within <answer> and </answer> tags. Reference: https://github.com/huggingface/open-r1/blob/main/src/open_r1/rewards.py
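The kind of check described above can be sketched with a regular expression. This is a minimal, standalone illustration and not the FormatReward implementation: the helper name format_score, the default pattern, and the 1.0 / 0.0 score values are all assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical default pattern, for illustration only. The pattern that
# FormatReward actually uses when `pattern` is None may differ.
DEFAULT_PATTERN = r"^\s*<think>.*?</think>\s*<answer>.*?</answer>\s*$"


def format_score(response: str, pattern: Optional[str] = None) -> float:
    """Return 1.0 if the response follows the tag layout, else 0.0.

    The score values are assumptions; a real reward function may use
    different values (for example, a penalty for malformed output).
    """
    # re.DOTALL lets "." match newlines, so multi-line reasoning is accepted.
    regex = re.compile(pattern or DEFAULT_PATTERN, re.DOTALL)
    return 1.0 if regex.match(response) else 0.0


well_formed = "<think>2 + 2 = 4</think>\n<answer>4</answer>"
missing_tags = "The answer is 4."
print(format_score(well_formed))
print(format_score(missing_tags))
```

Passing a custom pattern string mirrors the optional pattern argument in the class signature, although how FormatReward interprets that argument is not documented on this page.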

__init__(pattern: str | None = None)