trinity.utils.math_eval_utils module
Utility functions for strictly parsing and evaluating mathematical answers.
This module is a modified and simplified version of the official evaluation code for Qwen2.5-Math, designed for easier standalone use.
Original source: https://github.com/QwenLM/Qwen2.5-Math
Key modifications include:
- Retained only the core parsing logic for the common qwen_boxed prompt format.
- Consolidated essential parsing and evaluation functions from multiple files into this single module.
- Simplified benchmark handling and conditional logic for broader applicability.
- Simplified or removed calls to external tools like TIR.
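To make the qwen_boxed convention concrete, the sketch below pulls the contents of the last `\boxed{...}` out of a response by matching braces. It is an illustration only, not this module's implementation: the helper name `last_boxed_content` is hypothetical, and it assumes the final answer is wrapped in `\boxed{}`.

```python
from typing import Optional


def last_boxed_content(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None.

    Hypothetical helper for illustration; not part of math_eval_utils.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None  # no boxed answer present

    depth = 1  # we are already inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace of \boxed{ reached.
                return "".join(chars)
        chars.append(ch)
    return None  # braces never balanced, treat as no answer


if __name__ == "__main__":
    print(last_boxed_content(r"So the answer is \boxed{\frac{1}{2}}."))
```

Counting braces rather than using a regular expression keeps nested groups such as `\frac{1}{2}` intact.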
- trinity.utils.math_eval_utils.verify_math_answer(response_text: str, ground_truth: str) -> Tuple[float, Dict[str, Any]]
Strictly check whether the answer in the response equals the ground truth.
- trinity.utils.math_eval_utils.extract_answer(response_text: str) -> str | None
Extract the answer expression from the response string.
- trinity.utils.math_eval_utils.strip_string(input_str: str | None) -> str | None
Clean and normalize math answer strings.
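The signatures above suggest a pipeline of extracting, normalizing, and then comparing, with the result returned as a score plus a details dictionary. The sketch below mimics that shape for an already-extracted prediction. It is a simplified stand-in under stated assumptions, not the module's actual logic: the names `normalize_answer` and `score_answer`, the specific cleanup rules, and the keys of the details dictionary are all hypothetical.

```python
from typing import Any, Dict, Optional, Tuple


def normalize_answer(answer: Optional[str]) -> Optional[str]:
    """Apply a few illustrative cleanup rules (assumed, not the real ones)."""
    if answer is None:
        return None
    cleaned = answer.strip()
    # Drop LaTeX sizing and spacing commands that do not change meaning.
    cleaned = cleaned.replace("\\left", "").replace("\\right", "")
    cleaned = cleaned.replace("\\!", "").replace("$", "")
    # Remove all spaces and any trailing sentence period.
    cleaned = cleaned.replace(" ", "")
    return cleaned.rstrip(".")


def score_answer(
    predicted: Optional[str], ground_truth: str
) -> Tuple[float, Dict[str, Any]]:
    """Return (score, details), mirroring the Tuple[float, Dict] shape."""
    pred = normalize_answer(predicted)
    truth = normalize_answer(ground_truth)
    # Strict comparison: exact string equality after normalization.
    correct = pred is not None and pred == truth
    details = {"predicted": pred, "ground_truth": truth, "correct": correct}
    return (1.0 if correct else 0.0), details


if __name__ == "__main__":
    print(score_answer(" $\\left( 1, 2 \\right)$ ", "(1,2)"))
```

Normalizing both sides with the same function before comparing means formatting differences alone cannot cause a mismatch, while a missing prediction always scores zero.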