trinity.common.workflows.eval_workflow module#

Evaluation Workflow Class

class trinity.common.workflows.eval_workflow.MathEvalWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models: List[OpenAI] | None = None)[source]#

Bases: Workflow

A workflow for standard math evaluation.

The evaluation standard and prompting style are follow the Qwen2.5-Math model’s evaluation methodology. For more details on their approach, see: QwenLM/Qwen2.5-Math

__init__(*, task: Task, model: ModelWrapper, auxiliary_models: List[OpenAI] | None = None)[source]#

format_messages()[source]#: Format message for the evaluation of qwen_boxed type.

run() → List[Experience][source]#: Run workflow and return a list of experiences.

class trinity.common.workflows.eval_workflow.AsyncMathEvalWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models: List[OpenAI] | None = None)[source]#

Bases: MathEvalWorkflow

is_async: bool = True#

async run_async() → List[Experience][source]#: Run workflow in async and return a list of experiences.

trinity.common.workflows.eval_workflow module

Contents

trinity.common.workflows.eval_workflow module#