trinity.common.rewards
Submodules
trinity.common.rewards.accuracy_reward module
- class trinity.common.rewards.accuracy_reward.AccuracyRewardShapper(answer_parser: Callable[[str], str], correct_reward: float = 1.0, incorrect_reward: float = 0.0, kwargs: Dict[str, Any] = {})[source]
Bases: RewardShapper
Shaper for accuracy-based rewards.
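As a rough illustration of what the constructor parameters suggest, the standalone sketch below parses an answer from a response, compares it with a reference, and returns `correct_reward` or `incorrect_reward`. It does not use the Trinity classes. The function `accuracy_reward`, the helper `last_token_parser`, and the exact-string comparison are all assumptions made for illustration, not the library's implementation.

```python
from typing import Callable


def last_token_parser(response: str) -> str:
    """Hypothetical answer parser: return the last whitespace-separated token."""
    tokens = response.split()
    return tokens[-1] if tokens else ""


def accuracy_reward(
    response: str,
    truth: str,
    answer_parser: Callable[[str], str],
    correct_reward: float = 1.0,
    incorrect_reward: float = 0.0,
) -> float:
    """Return correct_reward when the parsed answer equals truth (assumed rule)."""
    parsed = answer_parser(response)
    return correct_reward if parsed.strip() == truth.strip() else incorrect_reward


print(accuracy_reward("The answer is 42", "42", last_token_parser))
print(accuracy_reward("The answer is 41", "42", last_token_parser))
```

Passing the parser as a callable mirrors the `answer_parser: Callable[[str], str]` parameter in the signature above, so the comparison logic stays independent of how answers are extracted.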
trinity.common.rewards.agents_reward module
trinity.common.rewards.base module
trinity.common.rewards.composite_reward module
- class trinity.common.rewards.composite_reward.CompositeRewardShapper(shappers: List[Tuple[RewardShapper, float]])[source]
Bases: RewardShapper
Combines multiple reward shapers, each paired with a weight.
- __init__(shappers: List[Tuple[RewardShapper, float]])[source]
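The `List[Tuple[RewardShapper, float]]` argument pairs each component with a weight. One plausible way to combine such pairs is a weighted sum, sketched below with plain callables standing in for shapers. The weighted-sum rule, the `Scorer` alias, and the two example scorers are assumptions for illustration; the source does not state how the class combines its components.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a shaper: any callable mapping a response to a score.
Scorer = Callable[[str], float]


def composite_reward(response: str, scorers: List[Tuple[Scorer, float]]) -> float:
    """Weighted sum of component scores (assumed combination rule)."""
    return sum(weight * scorer(response) for scorer, weight in scorers)


def has_answer_tag(response: str) -> float:
    """Hypothetical scorer: 1.0 if the response contains an <answer> tag."""
    return 1.0 if "<answer>" in response else 0.0


def is_short(response: str) -> float:
    """Hypothetical scorer: 1.0 if the response is under 100 characters."""
    return 1.0 if len(response) < 100 else 0.0


weighted = [(has_answer_tag, 0.75), (is_short, 0.25)]
total = composite_reward("<answer>4</answer>", weighted)
print(total)
```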
trinity.common.rewards.format_reward module
- class trinity.common.rewards.format_reward.FormatRewardShapper(pattern: str, correct_format_reward: float = 1.0, incorrect_format_reward: float = 0.0)[source]
Bases: RewardShapper
Shaper for format-based rewards.
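The parameters suggest a pattern check that yields one of two reward values. The sketch below shows that idea as a standalone function. The use of `re.fullmatch` with `re.DOTALL`, the function name `format_reward`, and the example pattern are assumptions; the actual class may apply its pattern differently.

```python
import re


def format_reward(
    response: str,
    pattern: str,
    correct_format_reward: float = 1.0,
    incorrect_format_reward: float = 0.0,
) -> float:
    """Return correct_format_reward if the whole response matches pattern."""
    # re.fullmatch and re.DOTALL are assumptions, not the library's documented behavior.
    if re.fullmatch(pattern, response, flags=re.DOTALL):
        return correct_format_reward
    return incorrect_format_reward


# Hypothetical pattern: the response must end with "Answer: <integer>".
pattern = r".*Answer: -?\d+\s*"
print(format_reward("Let me think.\nAnswer: 7", pattern))
print(format_reward("I am not sure.", pattern, incorrect_format_reward=-0.5))
```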
trinity.common.rewards.human_reward module
trinity.common.rewards.reward_fn module
Base Reward Function Class.
- class trinity.common.rewards.reward_fn.AccuracyReward(answer_parser: Callable[[str], str] | None = None)[source]
Bases: RewardFn
A reward function that rewards correct answers. Ref: https://github.com/huggingface/open-r1/blob/main/src/open_r1/rewards.py
- class trinity.common.rewards.reward_fn.FormatReward(pattern: str | None = None)[source]
Bases: RewardFn
A reward function that checks if the reasoning process is enclosed within <think> and </think> tags, while the final answer is enclosed within <answer> and </answer> tags. Ref: https://github.com/huggingface/open-r1/blob/main/src/open_r1/rewards.py
- class trinity.common.rewards.reward_fn.MathRewardFn(answer_parser=<function simple_answer_parser>, pattern='.*?<think>.*?</think>\\s*<answer>.*?</answer>\\s*$')[source]
Bases: RewardFn
A reward function for math tasks.
- DEFAULT_FORMAT_PATTERN = '.*?<think>.*?</think>\\s*<answer>.*?</answer>\\s*$'
- DEFAULT_ANSWER_PARSER() → str
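To see which responses the documented default pattern accepts, the sketch below applies it to a few strings. The doubled backslashes in the signature are read here as the escaped form of `\s`, so the pattern is written as a raw string. Applying it with `re.match` and `re.DOTALL` is an assumption, and `matches_default_format` is a hypothetical helper, not part of the module.

```python
import re

# Pattern as documented for DEFAULT_FORMAT_PATTERN, written as a raw string.
DEFAULT_FORMAT_PATTERN = r".*?<think>.*?</think>\s*<answer>.*?</answer>\s*$"


def matches_default_format(response: str) -> bool:
    """Check a response against the pattern (hypothetical helper)."""
    # re.match with re.DOTALL is an assumption about how the pattern is applied.
    return re.match(DEFAULT_FORMAT_PATTERN, response, re.DOTALL) is not None


# Reasoning in <think>, then the answer in <answer>: accepted.
print(matches_default_format("<think>2 + 2 = 4</think>\n<answer>4</answer>"))
# Missing <think> block: rejected.
print(matches_default_format("<answer>4</answer>"))
# Text after the closing </answer> tag: rejected because of the trailing \s*$.
print(matches_default_format("<think>ok</think><answer>4</answer> trailing text"))
```

Under these assumptions the pattern tolerates text before `<think>` and whitespace after `</answer>`, but nothing else after the answer block.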
trinity.common.rewards.tool_reward module
Module contents
Reward functions for RFT
- class trinity.common.rewards.AccuracyReward(answer_parser: Callable[[str], str] | None = None)[source]
Bases: RewardFn
A reward function that rewards correct answers. Ref: https://github.com/huggingface/open-r1/blob/main/src/open_r1/rewards.py
- class trinity.common.rewards.FormatReward(pattern: str | None = None)[source]
Bases: RewardFn
A reward function that checks if the reasoning process is enclosed within <think> and </think> tags, while the final answer is enclosed within <answer> and </answer> tags. Ref: https://github.com/huggingface/open-r1/blob/main/src/open_r1/rewards.py