trinity.buffer.operators.filters.reward_filter module

class trinity.buffer.operators.filters.reward_filter.RewardFilter(threshold: float = 0.0)

Bases: ExperienceOperator

Filter experiences based on the reward value.

Note: This filter assumes that the reward is already calculated and stored in the Experience object.

__init__(threshold: float = 0.0)
process(exps: List[Experience]) -> Tuple[List[Experience], dict]

Filter experiences based on reward value.
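
As a rough illustration of threshold-based reward filtering, the sketch below uses a stand-in Experience dataclass and a hypothetical filter_by_reward helper rather than the real Trinity classes. Whether the actual operator keeps experiences whose reward equals the threshold (>= versus >), and which keys the returned metrics dict contains, are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Experience:
    """Stand-in for the real Experience class; only `reward` is modeled."""

    reward: float


def filter_by_reward(
    exps: List[Experience], threshold: float = 0.0
) -> Tuple[List[Experience], Dict[str, int]]:
    """Hypothetical helper mirroring the shape of process()."""
    # Assumption: experiences with reward >= threshold are kept.
    kept = [exp for exp in exps if exp.reward >= threshold]
    # Assumption: the metrics dict reports how many experiences were dropped.
    metrics = {"filtered_count": len(exps) - len(kept)}
    return kept, metrics


exps = [Experience(reward=r) for r in (-1.0, 0.0, 0.5, 2.0)]
kept, metrics = filter_by_reward(exps, threshold=0.5)
```

The helper returns the same kind of pair as process(): the surviving experiences and a small dict of metrics.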

class trinity.buffer.operators.filters.reward_filter.RewardSTDFilter(threshold: float = 0.0)

Bases: ExperienceOperator

Filter experiences based on the standard deviation of rewards within each group.

Note: This filter assumes that the reward is already calculated and stored in the Experience object.

__init__(threshold: float = 0.0)
process(exps: List[Experience]) -> Tuple[List[Experience], dict]

Filter experiences based on reward std.
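
A minimal sketch of filtering by per-group reward standard deviation follows. It does not use the real Trinity classes: the Experience dataclass, its group_id field, and the filter_by_reward_std helper are hypothetical. The sketch also assumes that a group is kept only when its standard deviation is strictly greater than the threshold, and that the population standard deviation is used; the actual operator may differ on both points.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import pstdev
from typing import Dict, List, Tuple


@dataclass
class Experience:
    """Stand-in for the real Experience class."""

    group_id: str  # hypothetical grouping key
    reward: float


def filter_by_reward_std(
    exps: List[Experience], threshold: float = 0.0
) -> Tuple[List[Experience], Dict[str, int]]:
    """Hypothetical helper mirroring the shape of process()."""
    # Bucket experiences by their (assumed) group key.
    groups: Dict[str, List[Experience]] = defaultdict(list)
    for exp in exps:
        groups[exp.group_id].append(exp)

    kept: List[Experience] = []
    for group in groups.values():
        # Population standard deviation of the rewards in this group.
        std = pstdev([exp.reward for exp in group])
        # Assumption: groups whose rewards barely vary are dropped whole.
        if std > threshold:
            kept.extend(group)

    metrics = {"filtered_count": len(exps) - len(kept)}
    return kept, metrics


exps = [
    Experience("a", 1.0),
    Experience("a", 1.0),  # group "a": identical rewards, std = 0
    Experience("b", 0.0),
    Experience("b", 1.0),  # group "b": std = 0.5
]
kept, metrics = filter_by_reward_std(exps, threshold=0.0)
```

With the default threshold of 0.0, this sketch drops any group in which every experience received the same reward, which is the case where group-relative comparisons carry no signal.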