trinity.algorithm.advantage_fn.ppo_advantage module

PPO’s GAE advantage computation

Ref: https://github.com/volcengine/verl/blob/main/verl/trainer/ppo/core_algos.py
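As a rough illustration of what a GAE computation involves, the sketch below applies the standard recursion, delta_t = r_t + gamma * V_{t+1} - V_t and A_t = delta_t + gamma * lam * A_{t+1}, to a single trajectory. It is a minimal pure-Python sketch, not this module's code. The function name `gae_advantages` and its list-based inputs are hypothetical, and it assumes the value after the final step is zero. The referenced implementation works on batched tensors and may additionally mask and normalize the result.

```python
from typing import List, Sequence, Tuple


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 1.0,
    lam: float = 1.0,
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: GAE advantages and returns for one trajectory."""
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    advantages = [0.0] * len(rewards)
    next_value = 0.0      # assumption: value after the last step is zero
    next_advantage = 0.0  # running A_{t+1}, zero past the end

    # Walk backwards so each step can reuse the advantage of its successor.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]
        next_advantage = delta + gamma * lam * next_advantage
        advantages[t] = next_advantage
        next_value = values[t]

    # The return target for the value function is advantage plus value.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns
```

With the defaults `gamma=1.0` and `lam=1.0`, each advantage reduces to the sum of remaining rewards minus the value estimate at that step. Setting `lam=0.0` reduces it to the one-step TD error.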

class trinity.algorithm.advantage_fn.ppo_advantage.PPOAdvantageFn(gamma: float = 1.0, lam: float = 1.0)

Bases: AdvantageFn

__init__(gamma: float = 1.0, lam: float = 1.0) → None

classmethod default_args() → Dict

Returns: The default init arguments for the advantage function.
Return type: Dict
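One way such a `default_args()` classmethod is commonly used is to merge its result with user overrides before construction. The sketch below uses a hypothetical stand-in class, `SketchAdvantageFn`, rather than the real `PPOAdvantageFn`. It assumes the returned dict is keyed by the `__init__` parameter names (`gamma`, `lam`) with the documented defaults, which this page does not confirm.

```python
from typing import Dict


class SketchAdvantageFn:
    """Hypothetical stand-in that mirrors the documented signature."""

    def __init__(self, gamma: float = 1.0, lam: float = 1.0) -> None:
        self.gamma = gamma
        self.lam = lam

    @classmethod
    def default_args(cls) -> Dict:
        # Assumption: keys match the __init__ parameter names.
        return {"gamma": 1.0, "lam": 1.0}


# Start from the defaults, then apply any user-supplied overrides.
overrides = {"lam": 0.95}
args = {**SketchAdvantageFn.default_args(), **overrides}
fn = SketchAdvantageFn(**args)
```

Keeping the defaults in a classmethod lets configuration code look them up without instantiating the class.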
