trinity.common.rewards.accuracy_reward module


Accuracy Reward Function Class.

class trinity.common.rewards.accuracy_reward.AccuracyReward(answer_parser: Callable[[str], str] | None = None)

Bases: RewardFn

A reward function that rewards correct answers. Ref: huggingface/open-r1

__init__(answer_parser: Callable[[str], str] | None = None)
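
The signature above only says that answer_parser is an optional callable taking a string and returning a string. As a minimal sketch of what such a callable might look like, the following defines a hypothetical parser, parse_final_answer, that pulls the final answer out of a model response. The function name and the <answer>...</answer> tag convention are assumptions made for illustration; they are not taken from this module.

```python
import re
from typing import Callable, Optional


def parse_final_answer(response: str) -> str:
    """Hypothetical parser matching Callable[[str], str].

    Assumes (for illustration only) that the model wraps its final answer
    in <answer>...</answer> tags. Returns the content of the last such tag,
    or the stripped response when no tag is present.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", response, flags=re.DOTALL)
    return matches[-1].strip() if matches else response.strip()


# The function satisfies the annotated parameter type from the signature.
answer_parser: Optional[Callable[[str], str]] = parse_final_answer
```

Based on the documented constructor, a parser like this could be supplied as AccuracyReward(answer_parser=parse_final_answer). How the class applies the parser internally is not described here, so the tag format should be adapted to whatever the responses actually contain.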