trinity.trainer.verl.fsdp_workers module
The main entry point to run the PPO algorithm. Modified from https://github.com/volcengine/verl/blob/v0.4.1/verl/workers/fsdp_workers.py
- class trinity.trainer.verl.fsdp_workers.ActorRolloutRefWorker(*args, **kwargs)
Bases: Worker
Depending on config.rollout, this worker can be instantiated as a standalone actor, a standalone rollout, a standalone reference policy, or a hybrid engine.
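As a rough illustration of how one worker class can cover several deployment modes, the sketch below maps a role string to the components that would be active. It does not import or reproduce this module. The role strings, `RoleFlags`, and `resolve_role` are hypothetical names chosen for the example (the role strings follow the naming used in upstream verl, which this page does not confirm), so check the linked source for the values the worker actually accepts.

```python
from dataclasses import dataclass

# Assumed role strings; hypothetical for this sketch, not taken from this page.
VALID_ROLES = ("actor", "rollout", "ref", "actor_rollout", "actor_rollout_ref")


@dataclass(frozen=True)
class RoleFlags:
    """Which components a worker with a given role would host (hypothetical)."""

    is_actor: bool
    is_rollout: bool
    is_ref: bool


def resolve_role(role: str) -> RoleFlags:
    """Translate a role string into per-component flags.

    A combined role such as "actor_rollout" enables every component
    named in it, which corresponds to the hybrid-engine case.
    """
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role {role!r}; expected one of {VALID_ROLES}")
    parts = role.split("_")
    return RoleFlags(
        is_actor="actor" in parts,
        is_rollout="rollout" in parts,
        is_ref="ref" in parts,
    )


if __name__ == "__main__":
    for name in VALID_ROLES:
        print(name, resolve_role(name))
```

Under these assumptions, a standalone role enables exactly one flag, while a combined role enables several at once and so describes a hybrid engine.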
- __init__(config: DictConfig, role: str)
Initialize the worker with environment settings and device configuration.
- Parameters:
cuda_visible_devices (str, optional) – CUDA visible devices configuration. Defaults to None.
- set_algorithm(algo_config: AlgorithmConfig)