trinity.common.workflows.step_wise_workflow module
- class trinity.common.workflows.step_wise_workflow.StepWiseRewardWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models=None, use_openai_client=True)
Bases: Workflow
A workflow that implements step-wise rewards for tasks.
- __init__(*, task: Task, model: ModelWrapper, auxiliary_models=None, use_openai_client=True)
- run() -> list[Experience]
Run the workflow and return a list of experiences with step-wise rewards.
- abstract step(step_num: int) -> bool
Run a single step of your agent application.
- Parameters:
step_num (int) – The current step number.
- Returns:
True to continue running the agent application, False to stop.
- Return type:
bool
- Tips:
You can use the OpenAI client (self.client) to migrate an existing application with little effort.
- abstract reward(exps: list[Experience], step_num: int) -> float
Calculate the reward for the given experiences at the specified step.
- abstract property max_step_num
Return the maximum number of steps in the task.
- property repeatable
A workflow is repeatable if it can be run multiple times within the run() or run_async() method.
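The interface above can be illustrated with a minimal, self-contained sketch. It does not import Trinity: StepWiseSketch, its record() helper, and the simplified Experience dataclass are hypothetical stand-ins, and the loop in run() is only an assumption about how the abstract methods fit together (call step(), score that step's experiences with reward(), stop when step() returns False or max_step_num is reached). It is not the library's implementation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Experience:
    """Hypothetical stand-in for Trinity's Experience (only the fields used here)."""
    response: str
    step: int = 0
    reward: float = 0.0


class StepWiseSketch(ABC):
    """Stand-in that mirrors the documented interface; not the library class."""

    def __init__(self) -> None:
        self._pending: list[Experience] = []  # experiences produced by the current step

    def record(self, response: str) -> None:
        """Hypothetical helper: remember one model response for the current step."""
        self._pending.append(Experience(response=response))

    def run(self) -> list[Experience]:
        experiences: list[Experience] = []
        for step_num in range(self.max_step_num):
            keep_going = self.step(step_num)
            exps, self._pending = self._pending, []
            step_reward = self.reward(exps, step_num)  # one reward per step
            for exp in exps:
                exp.step = step_num
                exp.reward = step_reward
            experiences.extend(exps)
            if not keep_going:
                break
        return experiences

    @abstractmethod
    def step(self, step_num: int) -> bool: ...

    @abstractmethod
    def reward(self, exps: list[Experience], step_num: int) -> float: ...

    @property
    @abstractmethod
    def max_step_num(self) -> int: ...


class CountdownWorkflow(StepWiseSketch):
    """Toy agent: one response per step, stops itself after step 2."""

    @property
    def max_step_num(self) -> int:
        return 5

    def step(self, step_num: int) -> bool:
        self.record(f"answer at step {step_num}")
        return step_num < 2  # returning False ends the run early

    def reward(self, exps: list[Experience], step_num: int) -> float:
        return 1.0 / (step_num + 1)  # earlier steps score higher in this toy


experiences = CountdownWorkflow().run()
for exp in experiences:
    print(exp.step, exp.reward, exp.response)
```

In this toy run, step() returns False at step 2, so only three of the five allowed steps execute and each experience carries the reward of its own step.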
- class trinity.common.workflows.step_wise_workflow.AsyncStepWiseRewardWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models=None, use_openai_client=True)
Bases: StepWiseRewardWorkflow
Async version of StepWiseRewardWorkflow.
- property asynchronous
Whether the workflow runs in async mode.
- async run_async() -> list[Experience]
Run the workflow asynchronously and return a list of experiences with step-wise rewards.
- abstract async step_async(step_num: int) -> bool
Run a single step of your agent application asynchronously.
- Parameters:
step_num (int) – The current step number.
- Returns:
True to continue running the agent application, False to stop.
- Return type:
bool
- Tips:
You can use the OpenAI client (self.client) to migrate an existing application with little effort.
- abstract async reward_async(exps: list[Experience], step_num: int) -> float
Calculate the reward for the given experiences at the specified step asynchronously.
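The async variant can be sketched the same way with coroutines. As before, nothing here imports Trinity: AsyncStepWiseSketch, record(), and the simplified Experience are hypothetical stand-ins, the body of run_async() is an assumed control flow, and returning True from asynchronous is an assumption based on the property's description. The asyncio.sleep(0) calls are placeholders for awaited model or scorer calls.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Experience:
    """Hypothetical stand-in for Trinity's Experience (only the fields used here)."""
    response: str
    step: int = 0
    reward: float = 0.0


class AsyncStepWiseSketch(ABC):
    """Stand-in that mirrors the documented async interface; not the library class."""

    def __init__(self) -> None:
        self._pending: list[Experience] = []

    @property
    def asynchronous(self) -> bool:
        return True  # assumption: the async variant reports async mode

    def record(self, response: str) -> None:
        """Hypothetical helper: remember one model response for the current step."""
        self._pending.append(Experience(response=response))

    async def run_async(self) -> list[Experience]:
        experiences: list[Experience] = []
        for step_num in range(self.max_step_num):
            keep_going = await self.step_async(step_num)
            exps, self._pending = self._pending, []
            step_reward = await self.reward_async(exps, step_num)
            for exp in exps:
                exp.step = step_num
                exp.reward = step_reward
            experiences.extend(exps)
            if not keep_going:
                break
        return experiences

    @abstractmethod
    async def step_async(self, step_num: int) -> bool: ...

    @abstractmethod
    async def reward_async(self, exps: list[Experience], step_num: int) -> float: ...

    @property
    @abstractmethod
    def max_step_num(self) -> int: ...


class EchoWorkflow(AsyncStepWiseSketch):
    """Toy agent that never stops itself, so max_step_num bounds the run."""

    @property
    def max_step_num(self) -> int:
        return 2

    async def step_async(self, step_num: int) -> bool:
        await asyncio.sleep(0)  # placeholder for an awaited model call
        self.record(f"echo {step_num}")
        return True

    async def reward_async(self, exps: list[Experience], step_num: int) -> float:
        await asyncio.sleep(0)  # placeholder for an awaited scorer
        return float(step_num)


workflow = EchoWorkflow()
experiences = asyncio.run(workflow.run_async())
print([(exp.step, exp.reward) for exp in experiences])
```

Here step_async() always returns True, so the run ends only because max_step_num is reached.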
- class trinity.common.workflows.step_wise_workflow.RewardPropagationWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models=None, use_openai_client=True)
Bases: Workflow
A workflow that propagates rewards across multiple turns.
- __init__(*, task: Task, model: ModelWrapper, auxiliary_models=None, use_openai_client=True)
- run() -> list[Experience]
Run the workflow and return a list of experiences, each assigned the reward computed for the entire run.
- abstract step(step_num: int) -> bool
Run a single step of your agent application.
- Parameters:
step_num (int) – The current step number.
- Returns:
True to continue running the agent application, False to stop.
- Return type:
bool
- Tips:
You can use the OpenAI client (self.client) to migrate an existing application with little effort.
- abstract reward(exps: list[Experience]) -> float
Calculate the reward for the given experiences of the entire run.
- abstract property max_step_num
Return the maximum number of steps in the task.
- property repeatable
A workflow is repeatable if it can be run multiple times within the run() or run_async() method.
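The difference from the step-wise variant is that reward() takes no step_num and scores the whole run. A minimal sketch of that idea follows, assuming the single reward is computed after the last step and copied onto every experience. RewardPropagationSketch, record(), and the simplified Experience are hypothetical stand-ins, and the run() body is an assumption rather than the library's code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Experience:
    """Hypothetical stand-in for Trinity's Experience (only the fields used here)."""
    response: str
    step: int = 0
    reward: float = 0.0


class RewardPropagationSketch(ABC):
    """Stand-in that mirrors the documented interface; not the library class."""

    def __init__(self) -> None:
        self._pending: list[Experience] = []

    def record(self, response: str) -> None:
        """Hypothetical helper: remember one model response for the current step."""
        self._pending.append(Experience(response=response))

    def run(self) -> list[Experience]:
        experiences: list[Experience] = []
        for step_num in range(self.max_step_num):
            keep_going = self.step(step_num)
            exps, self._pending = self._pending, []
            for exp in exps:
                exp.step = step_num
            experiences.extend(exps)
            if not keep_going:
                break
        final_reward = self.reward(experiences)  # scored once, after the run
        for exp in experiences:
            exp.reward = final_reward  # same reward propagated to every step
        return experiences

    @abstractmethod
    def step(self, step_num: int) -> bool: ...

    @abstractmethod
    def reward(self, exps: list[Experience]) -> float: ...

    @property
    @abstractmethod
    def max_step_num(self) -> int: ...


class GuessingWorkflow(RewardPropagationSketch):
    """Toy agent: guesses 1, 2, 3, ... and stops once it hits the target."""

    TARGET = 3

    @property
    def max_step_num(self) -> int:
        return 5

    def step(self, step_num: int) -> bool:
        guess = step_num + 1
        self.record(str(guess))
        return guess != self.TARGET  # stop as soon as the guess is correct

    def reward(self, exps: list[Experience]) -> float:
        # Only the final outcome matters: did the last guess hit the target?
        return 1.0 if exps and exps[-1].response == str(self.TARGET) else 0.0


experiences = GuessingWorkflow().run()
for exp in experiences:
    print(exp.step, exp.response, exp.reward)
```

In this toy run the early wrong guesses receive the same reward as the final correct one, which is the point of propagating an outcome-level reward back over the steps that led to it.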
- class trinity.common.workflows.step_wise_workflow.AsyncRewardPropagationWorkflow(*, task: Task, model: ModelWrapper, auxiliary_models=None, use_openai_client=True)
Bases: RewardPropagationWorkflow
Async version of RewardPropagationWorkflow.
- property asynchronous
Whether the workflow runs in async mode.
- async run_async() -> list[Experience]
Run the workflow asynchronously and return a list of experiences, each assigned the reward computed for the entire run.
- abstract async step_async(step_num: int) -> bool
Run a single step of your agent application asynchronously.
- Parameters:
step_num (int) – The current step number.
- Returns:
True to continue running the agent application, False to stop.
- Return type:
bool
- Tips:
You can use the OpenAI client (self.client) to migrate an existing application with little effort.
- abstract async reward_async(exps: list[Experience]) -> float
Calculate the reward for the given experiences of the entire run asynchronously.
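One practical benefit of coroutine-based workflows is that several runs can share an event loop. The sketch below illustrates this with asyncio.gather; it says nothing about how Trinity itself schedules workflows. AsyncPropagationSketch, record(), FixedLengthWorkflow, and the simplified Experience are hypothetical stand-ins, and the run_async() body is an assumed control flow.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Experience:
    """Hypothetical stand-in for Trinity's Experience (only the fields used here)."""
    response: str
    step: int = 0
    reward: float = 0.0


class AsyncPropagationSketch(ABC):
    """Stand-in that mirrors the documented async interface; not the library class."""

    def __init__(self) -> None:
        self._pending: list[Experience] = []

    def record(self, response: str) -> None:
        """Hypothetical helper: remember one model response for the current step."""
        self._pending.append(Experience(response=response))

    async def run_async(self) -> list[Experience]:
        experiences: list[Experience] = []
        for step_num in range(self.max_step_num):
            keep_going = await self.step_async(step_num)
            exps, self._pending = self._pending, []
            for exp in exps:
                exp.step = step_num
            experiences.extend(exps)
            if not keep_going:
                break
        final_reward = await self.reward_async(experiences)  # scored once per run
        for exp in experiences:
            exp.reward = final_reward
        return experiences

    @abstractmethod
    async def step_async(self, step_num: int) -> bool: ...

    @abstractmethod
    async def reward_async(self, exps: list[Experience]) -> float: ...

    @property
    @abstractmethod
    def max_step_num(self) -> int: ...


class FixedLengthWorkflow(AsyncPropagationSketch):
    """Toy agent that runs for a fixed number of steps."""

    def __init__(self, name: str, steps: int) -> None:
        super().__init__()
        self.name = name
        self._steps = steps

    @property
    def max_step_num(self) -> int:
        return self._steps

    async def step_async(self, step_num: int) -> bool:
        await asyncio.sleep(0)  # placeholder for an awaited model call
        self.record(f"{self.name}:{step_num}")
        return True

    async def reward_async(self, exps: list[Experience]) -> float:
        return float(len(exps))  # toy score: number of experiences in the run


async def main() -> list[list[Experience]]:
    workflows = [FixedLengthWorkflow("a", 1), FixedLengthWorkflow("b", 3)]
    # gather() interleaves the runs and returns results in argument order.
    return await asyncio.gather(*(w.run_async() for w in workflows))


results = asyncio.run(main())
for run in results:
    print([(exp.response, exp.reward) for exp in run])
```

Each workflow keeps its own pending list, so the interleaved runs do not mix experiences, and each run's single reward is applied only to its own steps.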