trinity.common.rewards.accuracy_reward module
Accuracy Reward Function Class.
class trinity.common.rewards.accuracy_reward.AccuracyReward(answer_parser: Callable[[str], str] | None = None)
Bases: RewardFn
A reward function that rewards correct answers.
Ref: huggingface/open-r1
__init__(answer_parser: Callable[[str], str] | None = None)
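
The signature above accepts an optional `answer_parser` that maps a raw response string to an answer string. The following is a minimal, self-contained sketch of how such a parser could plug into an accuracy-style reward. It does not import Trinity: `SketchAccuracyReward`, its `score` method, `default_parser`, and `last_token_parser` are hypothetical names chosen for illustration, and the exact-match comparison is an assumption, not the library's documented behaviour.

```python
from typing import Callable, Optional


def default_parser(text: str) -> str:
    """Hypothetical fallback parser: strip surrounding whitespace."""
    return text.strip()


def last_token_parser(text: str) -> str:
    """Hypothetical custom parser: keep the last whitespace-separated token."""
    tokens = text.split()
    return tokens[-1] if tokens else ""


class SketchAccuracyReward:
    """Illustrative stand-in for an accuracy reward, not the Trinity class."""

    def __init__(self, answer_parser: Optional[Callable[[str], str]] = None):
        # Assumption: when no parser is given, fall back to a trivial one.
        self.answer_parser = answer_parser or default_parser

    def score(self, response: str, truth: str) -> float:
        # Hypothetical method: 1.0 if the parsed response equals the
        # whitespace-stripped ground truth, otherwise 0.0.
        parsed = self.answer_parser(response)
        return 1.0 if parsed == truth.strip() else 0.0


if __name__ == "__main__":
    reward = SketchAccuracyReward(answer_parser=last_token_parser)
    print(reward.score("The answer is 42", "42"))
    print(reward.score("The answer is 41", "42"))
```

Taking the parser as a constructor argument keeps answer extraction separate from the comparison step, so the same reward can be reused across tasks whose responses are formatted differently.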