trinity.common.rewards.eval_utils module

trinity.common.rewards.eval_utils.parse_with_timeout(pred: str, parsing_timeout: int = 5, **kwargs) → list[str][source]

trinity.common.rewards.eval_utils.verify_with_timeout(gold: str, target: str, timeout_seconds: int = 5, **kwargs) → bool[source]

trinity.common.rewards.eval_utils.simple_answer_parser(response: str) → list[str][source]

trinity.common.rewards.eval_utils.find_boxed_answer(raw_answer, timeout=10)[source]

Find the answer in a solution where the answer is enclosed in LaTeX's \boxed{} tag.

Parameters:

raw_answer (str) -- raw answer from the model
timeout (int) -- timeout in seconds for the regex

Returns:

the answer if found, otherwise None

Return type:

str
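To make the idea of extracting a boxed answer concrete, here is a minimal standalone sketch. It is not the library's implementation: the name `find_boxed_sketch` is hypothetical, it uses brace matching instead of a regex with a timeout, and it assumes the last `\boxed{...}` in the text is the one that matters.

```python
from typing import Optional


def find_boxed_sketch(raw_answer: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in raw_answer, or None.

    Hypothetical helper for illustration; not part of trinity.
    """
    marker = "\\boxed{"
    start = raw_answer.rfind(marker)
    if start < 0:
        return None  # no boxed tag at all

    begin = start + len(marker)
    depth = 1  # we are already inside the opening brace of \boxed{
    for i in range(begin, len(raw_answer)):
        ch = raw_answer[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: return what is inside.
                return raw_answer[begin:i]
    return None  # braces never balanced


print(find_boxed_sketch(r"so the answer is \boxed{\frac{1}{2}}."))
```

Counting brace depth lets nested LaTeX such as `\frac{1}{2}` survive intact, which a naive non-greedy regex would truncate.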

trinity.common.rewards.eval_utils.extract_solution(solution_str)[source]: Extract the equation from the solution string.

trinity.common.rewards.eval_utils.validate_equation(equation_str, available_numbers)[source]: Validate that the equation uses only the available numbers and uses each number exactly once.
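One plausible way to perform this check is to compare the multiset of numbers found in the equation with the multiset of available numbers. The sketch below is an illustration only: `validate_equation_sketch` is a hypothetical name, and it assumes the numbers are non-negative integers.

```python
import re
from typing import List


def validate_equation_sketch(equation_str: str, available_numbers: List[int]) -> bool:
    """Check that the equation uses exactly the available numbers, once each.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    # Pull every run of digits out of the equation text.
    numbers_in_eq = [int(n) for n in re.findall(r"\d+", equation_str)]
    # Sorting both lists compares them as multisets, so order is ignored
    # but repeated or missing numbers are detected.
    return sorted(numbers_in_eq) == sorted(available_numbers)


print(validate_equation_sketch("(3 + 5) * 2", [2, 3, 5]))  # True
print(validate_equation_sketch("3 + 3", [3, 5]))           # False
```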

trinity.common.rewards.eval_utils.evaluate_equation(equation_str)[source]: Safely evaluate the arithmetic equation using eval() with precautions.
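The exact precautions are not described here, so the following is only a sketch of one common approach: whitelist the characters allowed in the expression and call eval() with no builtins. The name `evaluate_equation_sketch` and the specific rules (including rejecting `**`) are assumptions, not the library's behavior.

```python
import re
from typing import Optional, Union

# Only digits, basic operators, parentheses, dots and whitespace are allowed.
_ALLOWED = re.compile(r"[\d+\-*/().\s]+")


def evaluate_equation_sketch(equation_str: str) -> Optional[Union[int, float]]:
    """Evaluate a plain arithmetic expression, returning None on any problem.

    Hypothetical helper for illustration; not part of trinity.
    """
    if not _ALLOWED.fullmatch(equation_str):
        return None  # contains letters, underscores, or other symbols
    if "**" in equation_str:
        return None  # exponentiation could make evaluation very expensive
    try:
        # Empty builtins so no names can be resolved inside the expression.
        return eval(equation_str, {"__builtins__": {}}, {})
    except Exception:
        # Covers syntax errors, division by zero, and similar failures.
        return None


print(evaluate_equation_sketch("(3 + 5) * 2"))  # 16
```

Rejecting input before it reaches eval() is the main safeguard; the empty builtins dictionary is a second layer rather than a complete sandbox.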

trinity.common.rewards.eval_utils.validate_think_pattern(text)[source]: Validate whether the <think> </think> tags are properly formatted.
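What counts as "properly formatted" is not spelled out, so the sketch below assumes a simple rule: exactly one `<think>` and one `</think>`, with the opening tag first. `validate_think_sketch` is a hypothetical name, and the real function may apply stricter or different checks.

```python
def validate_think_sketch(text: str) -> bool:
    """Return True if text has exactly one <think>...</think> pair in order.

    Hypothetical helper for illustration; the rule is an assumption.
    """
    open_tag, close_tag = "<think>", "</think>"
    # Require exactly one of each tag.
    if text.count(open_tag) != 1 or text.count(close_tag) != 1:
        return False
    # The opening tag must appear before the closing tag.
    return text.index(open_tag) < text.index(close_tag)


print(validate_think_sketch("<think>2 + 2 = 4</think> The answer is 4."))  # True
print(validate_think_sketch("The answer is 4."))                           # False
```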

trinity.common.rewards.eval_utils.compute_score_v0(solution_str, ground_truth) → float[source]: Compute the score of the solution string against the ground truth. This function suits easily verifiable problems in which the answer is placed inside \boxed{}.

trinity.common.rewards.eval_utils.is_equiv(str1, str2, verbose=False)[source]

trinity.common.rewards.eval_utils.remove_boxed(s)[source]

trinity.common.rewards.eval_utils.last_boxed_only_string(string)[source]: Extracts the last \boxed{...} or \fbox{...} substring from the input string.

trinity.common.rewards.eval_utils.remove_right_units(string)[source]