trinity.trainer.verl.fsdp_workers module
The main entry point for running the PPO algorithm. Modified from volcengine/verl.
- class trinity.trainer.verl.fsdp_workers.ActorRolloutRefWorker(*args, **kwargs)
Bases: Worker
This worker can be instantiated as a standalone actor, a standalone rollout, a standalone reference policy, or a hybrid engine, depending on config.rollout.
- __init__(config: DictConfig, role: str)
Initialize the worker with environment settings and device configuration.
- Parameters:
cuda_visible_devices (str, optional) – CUDA visible devices configuration. Defaults to None.
- set_algorithm(algo_config: AlgorithmConfig)
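The idea that one ActorRolloutRefWorker class can serve as an actor, a rollout, a reference policy, or a hybrid can be illustrated with a small standalone sketch. It does not import trinity or verl. The role names, the `RoleFlags` dataclass, and the `resolve_role` helper are all hypothetical; the role strings are modeled on upstream volcengine/verl and are not confirmed by this page.

```python
from dataclasses import dataclass

# Assumed role names, modeled on upstream volcengine/verl.
# This page does not list the accepted values of `role`.
ROLES = ("actor", "rollout", "ref", "actor_rollout", "actor_rollout_ref")


@dataclass(frozen=True)
class RoleFlags:
    """Hypothetical container recording which components a worker hosts."""

    is_actor: bool
    is_rollout: bool
    is_ref: bool


def resolve_role(role: str) -> RoleFlags:
    """Map a role string to component flags (illustrative helper only)."""
    if role not in ROLES:
        raise ValueError(f"unknown role {role!r}; expected one of {ROLES}")
    return RoleFlags(
        is_actor=role in ("actor", "actor_rollout", "actor_rollout_ref"),
        is_rollout=role in ("rollout", "actor_rollout", "actor_rollout_ref"),
        is_ref=role in ("ref", "actor_rollout_ref"),
    )


if __name__ == "__main__":
    # A combined role enables several components on the same worker.
    print(resolve_role("actor_rollout"))
```

Under these assumptions, a standalone role sets exactly one flag, while a combined role such as `actor_rollout_ref` sets several. The real worker may decide this differently, for example from fields in its config.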
- class trinity.trainer.verl.fsdp_workers.CriticWorker(*args, **kwargs)
Bases: Worker
- __init__(config)
Initialize the worker with environment settings and device configuration.
- Parameters:
cuda_visible_devices (str, optional) – CUDA visible devices configuration. Defaults to None.