trinity.common.rewards.format_reward module

Base Reward Function Class.

class trinity.common.rewards.format_reward.FormatReward(pattern: str | None = None)

Bases: RewardFn

A reward function that checks whether the reasoning process is enclosed within <think> and </think> tags and the final answer is enclosed within <answer> and </answer> tags. Ref: huggingface/open-r1

__init__(pattern: str | None = None)
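
The tag check described above can be illustrated with a small standalone sketch. This is not Trinity's implementation: the function name check_format and the DEFAULT_PATTERN regex are hypothetical, chosen only to show how an optional pattern argument might fall back to a default that requires a <think> block followed by an <answer> block.

```python
import re
from typing import Optional

# Hypothetical default pattern (an assumption, not taken from Trinity's source):
# a <think>...</think> block, optional whitespace, then an <answer>...</answer>
# block, with nothing before or after.
DEFAULT_PATTERN = r"^<think>.*?</think>\s*<answer>.*?</answer>$"


def check_format(response: str, pattern: Optional[str] = None) -> bool:
    """Return True if the response matches the expected tag layout."""
    # re.DOTALL lets '.' match newlines, so multi-line reasoning is accepted.
    regex = re.compile(pattern or DEFAULT_PATTERN, re.DOTALL)
    return regex.match(response.strip()) is not None


if __name__ == "__main__":
    print(check_format("<think>2 + 2 = 4</think>\n<answer>4</answer>"))
    print(check_format("The answer is 4."))
```

A reward function built on such a check would typically map the boolean to a numeric score; the exact values FormatReward returns are not stated on this page.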
