trinity.utils.math_eval_utils module

Utility functions for strictly parsing and evaluating mathematical answers.

This module is a modified and simplified version of the official evaluation code for Qwen2.5-Math, designed for easier standalone use.

Original source: https://github.com/QwenLM/Qwen2.5-Math

Key modifications include:

Retained only the core parsing logic for the common qwen_boxed prompt format.
Consolidated essential parsing and evaluation functions from multiple files into this single module.
Simplified benchmark handling and conditional logic for broader applicability.
Simplified or removed calls to external tools like TIR.

trinity.utils.math_eval_utils.verify_math_answer(response_text: str, ground_truth: str) → Tuple[float, Dict[str, Any]]: Strictly check whether the response is equal to the ground truth.

trinity.utils.math_eval_utils.extract_answer(response_text: str) → str | None: Extract the answer expression from the response string.
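To illustrate what extracting an answer from a qwen_boxed style response can involve, here is a minimal sketch. It is not the module's implementation: the helper name `extract_last_boxed` is hypothetical, and the choice to take the last `\boxed{...}` and to return None on unbalanced braces is an assumption made for this example.

```python
from typing import Optional


def extract_last_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None.

    Hypothetical helper for illustration only; the real extract_answer
    may handle more cases.
    """
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None  # no boxed answer present

    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: everything collected so far
                # is the boxed content, nested braces included.
                return "".join(chars)
        chars.append(ch)
    return None  # braces never balanced


print(extract_last_boxed(r"So the answer is \boxed{\frac{1}{2}}."))
```

Counting brace depth, rather than using a simple regular expression, keeps nested LaTeX such as `\frac{1}{2}` intact.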

trinity.utils.math_eval_utils.strip_string(input_str: str | None) → str | None: Clean and normalize math answer strings.

trinity.utils.math_eval_utils.fix_fracs(string)

trinity.utils.math_eval_utils.fix_a_slash_b(string)
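The page does not describe this helper. In answer-normalization code of this lineage, a function with this name typically rewrites a plain integer ratio such as 3/4 into LaTeX fraction form so that both spellings compare equal. The sketch below shows that idea under this assumption; `slash_to_frac` is a hypothetical name and the behavior shown may differ from the actual fix_a_slash_b.

```python
def slash_to_frac(text: str) -> str:
    """Rewrite an integer ratio "a/b" as "\\frac{a}{b}".

    Hypothetical illustration; anything that is not exactly two
    integers separated by one slash is returned unchanged.
    """
    parts = text.split("/")
    if len(parts) != 2:
        return text
    try:
        numerator = int(parts[0])
        denominator = int(parts[1])
    except ValueError:
        # Non-integer operands such as "x/2" are left as they are.
        return text
    return f"\\frac{{{numerator}}}{{{denominator}}}"


print(slash_to_frac("3/4"))
```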

trinity.utils.math_eval_utils.fix_sqrt(string)

trinity.utils.math_eval_utils.convert_word_number(text: str) → str

trinity.utils.math_eval_utils.math_equal(prediction: str | None, reference: str | None) → bool: Check whether two strings are mathematically equal by trying several comparison methods.
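The idea of trying several comparison methods can be sketched as a fallback chain that starts with the cheapest check. This is an illustration only, not the module's code: the names `to_float` and `answers_match`, the order of checks, and the relative tolerance of 1e-4 are all assumptions, and the symbolic step is left as a comment.

```python
import math
from typing import Optional


def to_float(text: str) -> Optional[float]:
    """Parse a number, ignoring thousands separators; None if not numeric."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None


def answers_match(prediction: Optional[str], reference: Optional[str]) -> bool:
    """Hypothetical fallback chain: exact match first, then numeric."""
    if prediction is None or reference is None:
        return False
    pred, ref = prediction.strip(), reference.strip()

    # Step 1: case-insensitive string equality.
    if pred.lower() == ref.lower():
        return True

    # Step 2: numeric comparison within an assumed relative tolerance.
    pred_num, ref_num = to_float(pred), to_float(ref)
    if pred_num is not None and ref_num is not None:
        return math.isclose(pred_num, ref_num, rel_tol=1e-4)

    # Step 3: a symbolic comparison (for example with SymPy) would go
    # here; it is omitted to keep the sketch dependency-free.
    return False


print(answers_match("1,000", "1000.0"))
```

Ordering the checks from cheapest to most expensive means most answer pairs are settled before any symbolic comparison is attempted.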

trinity.utils.math_eval_utils.numeric_equal(prediction: float, reference: float) → bool

trinity.utils.math_eval_utils.symbolic_equal(a: str, b: str) → bool: Compares two strings for symbolic equivalence using SymPy.
