trinity.common.workflows.eval_workflow module
Evaluation Workflow Class
- class trinity.common.workflows.eval_workflow.MathEvalWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models: List[OpenAI] | None = None)
Bases: Workflow
A workflow for standard math evaluation.
The evaluation standard and prompting style follow the Qwen2.5-Math model's evaluation methodology. For more details on that approach, see: QwenLM/Qwen2.5-Math
- __init__(*, task: Task, model: ModelWrapper, auxiliary_models: List[OpenAI] | None = None)
- run() -> List[Experience]
Run workflow and return a list of experiences.
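As a rough illustration of the kind of check a math evaluation performs, the sketch below extracts a final answer from a response and compares it with the ground truth. It assumes the boxed-answer convention associated with Qwen2.5-Math style prompts; `SYSTEM_PROMPT`, `extract_boxed`, and `is_correct` are hypothetical names for this example and are not taken from the Trinity source, whose actual grading logic may normalize answers far more thoroughly.

```python
from typing import Optional

# Assumption: a prompt in the Qwen2.5-Math chain-of-thought style.
# This string is illustrative and is not read from the Trinity code base.
SYSTEM_PROMPT = "Please reason step by step, and put your final answer within \\boxed{}."


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never closed


def is_correct(response: str, truth: str) -> bool:
    """Whitespace-insensitive string match (a deliberately naive comparison)."""
    answer = extract_boxed(response)
    return answer is not None and answer.replace(" ", "") == truth.replace(" ", "")
```

A real evaluator would typically also treat mathematically equivalent forms (for example `0.5` and `\frac{1}{2}`) as equal; this sketch does not.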
- class trinity.common.workflows.eval_workflow.AsyncMathEvalWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models: List[OpenAI] | None = None)
Bases: MathEvalWorkflow
- is_async: bool = True
- async run_async() -> List[Experience]
Run the workflow asynchronously and return a list of experiences.
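The relationship between the two classes (a synchronous `run()` and a subclass that sets `is_async = True` and provides `run_async()`) can be sketched with stand-in classes. Everything below is hypothetical scaffolding: `SketchExperience`, `SketchEvalWorkflow`, `AsyncSketchEvalWorkflow`, and `execute` are invented for this example, the model is a plain callable, and how Trinity's own runner actually dispatches on `is_async` is an assumption here.

```python
import asyncio
from typing import Callable, List


class SketchExperience:
    """Hypothetical stand-in for Experience: a response and its reward."""

    def __init__(self, response: str, reward: float) -> None:
        self.response = response
        self.reward = reward


class SketchEvalWorkflow:
    """Stand-in mirroring the keyword-only constructor and run() shape."""

    is_async: bool = False

    def __init__(self, *, task: dict, model: Callable[[str], str], auxiliary_models=None) -> None:
        self.task = task
        self.model = model
        self.auxiliary_models = auxiliary_models

    def _score(self, response: str) -> List[SketchExperience]:
        # Reward 1.0 on an exact match with the reference answer, else 0.0.
        reward = 1.0 if response.strip() == self.task["truth"] else 0.0
        return [SketchExperience(response, reward)]

    def run(self) -> List[SketchExperience]:
        return self._score(self.model(self.task["question"]))


class AsyncSketchEvalWorkflow(SketchEvalWorkflow):
    is_async: bool = True

    async def run_async(self) -> List[SketchExperience]:
        await asyncio.sleep(0)  # placeholder for an awaited model call
        return self._score(self.model(self.task["question"]))


def execute(workflow: SketchEvalWorkflow) -> List[SketchExperience]:
    """Call run_async() when the workflow declares is_async, otherwise run()."""
    if workflow.is_async:
        return asyncio.run(workflow.run_async())
    return workflow.run()
```

With a toy model such as `lambda question: "4"` and a task `{"question": "2 + 2 = ?", "truth": "4"}`, both workflow variants yield a single experience through `execute`, which shows why a caller only needs to inspect the `is_async` flag to choose the entry point.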