trinity.common.workflows.math_ruler_workflow module
Math workflow with RULER.
- class trinity.common.workflows.math_ruler_workflow.MathRULERWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models: List[ModelWrapper] | None = None)[source]
A workflow for math with RULER reward function.
Modified from MathWorkflow. Adapted from OpenPipe/ART.
- __init__(*, task: Task, model: ModelWrapper, auxiliary_models: List[ModelWrapper] | None = None)[source]
- reset(task: Task)[source]
Note that in this workflow, MathRewardFn is used only to calculate the 'golden reward', whereas the rewards used for RL training are calculated by RULER.
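The split between the two reward signals described in the note above can be sketched as follows. This is only an illustration under stated assumptions: `SketchExperience`, `assign_rewards`, and the `"golden_reward"` metric key are hypothetical names invented for this example and are not part of the Trinity API; how the workflow actually stores the golden reward is not specified here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchExperience:
    """Hypothetical stand-in for an experience record (not Trinity's Experience)."""

    response_text: str
    reward: float = 0.0
    metrics: Dict[str, float] = field(default_factory=dict)


def assign_rewards(
    experiences: List[SketchExperience],
    golden_rewards: List[float],
    ruler_scores: List[float],
) -> List[SketchExperience]:
    """Use RULER scores as training rewards; keep golden rewards for monitoring only."""
    if not (len(experiences) == len(golden_rewards) == len(ruler_scores)):
        raise ValueError("experiences, golden_rewards and ruler_scores must align")
    for exp, golden, ruler in zip(experiences, golden_rewards, ruler_scores):
        exp.reward = ruler  # the signal the RL trainer would optimize
        exp.metrics["golden_reward"] = golden  # rule-based check, logged only
    return experiences
```

Keeping the rule-based reward as a logged metric makes it possible to check whether the judge-based rewards track actual answer correctness, without letting the rule-based signal influence training.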
- run() → List[Experience][source]
Modified from SimpleWorkflow.run.
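- get_ruler_scores(responses: List[Experience], judger: Any) → Tuple[bool, List[float]][source]
Get RULER scores for the given responses using the judger model. Returns a bool together with a list of float scores.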
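As a rough illustration of the `Tuple[bool, List[float]]` return shape, the sketch below parses a judge model's free-text reply into a flag plus one score per response. It rests on assumptions that this page does not confirm: that the judge reply ends with a bracketed list of numbers, that scores are clipped to [0, 1], and that failures fall back to all-zero scores. `parse_judge_scores` is a hypothetical helper, not the library's implementation.

```python
import ast
from typing import List, Tuple


def parse_judge_scores(judge_reply: str, num_responses: int) -> Tuple[bool, List[float]]:
    """Hypothetical helper: extract one score per response from a judge's reply."""
    zeros = [0.0] * num_responses  # fallback when parsing fails (an assumption)

    # Locate the last bracketed span in the reply, e.g. "... [0.9, 0.2]".
    start, end = judge_reply.rfind("["), judge_reply.rfind("]")
    if start == -1 or end == -1 or start > end:
        return False, zeros

    try:
        # literal_eval only accepts Python literals, so it is safe on model output.
        raw = ast.literal_eval(judge_reply[start : end + 1])
        # Clip each score to the range [0, 1] (an assumed convention).
        scores = [min(1.0, max(0.0, float(s))) for s in raw]
    except (ValueError, SyntaxError, TypeError):
        return False, zeros

    # The judge must return exactly one score per response.
    if len(scores) != num_responses:
        return False, zeros
    return True, scores
```

For example, `parse_judge_scores("Scores: [0.9, 0.2, 1.5]", 3)` yields `(True, [0.9, 0.2, 1.0])`, while a reply with no list or with the wrong number of scores yields `False` and all zeros.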
- class trinity.common.workflows.math_ruler_workflow.AsyncMathRULERWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models: List[ModelWrapper] | None = None)[source]
- is_async: bool = True
- async run_async() → List[Experience][source]
Modified from SimpleWorkflow.run.