trinity.buffer.operators.filters.reward_filter module
- class trinity.buffer.operators.filters.reward_filter.RewardFilter(threshold: float = 0.0)
Bases: ExperienceOperator
Filter experiences based on the reward value.
Note: This filter assumes that the reward is already calculated and stored in the Experience object.
- process(exps: List[Experience]) -> Tuple[List[Experience], dict]
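Filter experiences based on their reward value and `threshold`. Returns the experiences that pass the filter together with an accompanying dict.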
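The behaviour described above can be illustrated with a small, self-contained sketch. It does not import the library: `Experience` here is a hypothetical stand-in that models only a `reward` field, and `RewardFilterSketch` is an illustrative name. The comparison direction (keep when `reward >= threshold`) and the `filtered_count` key in the returned dict are assumptions, not confirmed details of `RewardFilter`.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Experience:
    """Hypothetical stand-in for the library's Experience; only `reward` is modeled."""
    reward: float


class RewardFilterSketch:
    """Illustrative sketch only; not the library's implementation."""

    def __init__(self, threshold: float = 0.0):
        self.threshold = threshold

    def process(self, exps: List[Experience]) -> Tuple[List[Experience], dict]:
        # Assumption: an experience is kept when its reward is >= threshold.
        kept = [exp for exp in exps if exp.reward >= self.threshold]
        # Assumption: the dict reports how many experiences were dropped.
        return kept, {"filtered_count": len(exps) - len(kept)}


exps = [Experience(-1.0), Experience(0.0), Experience(0.5)]
kept, info = RewardFilterSketch(threshold=0.0).process(exps)
print([exp.reward for exp in kept], info)
```

Because the filter reads `reward` directly, it has to run after whatever step computes and stores rewards, as the note above says.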
- class trinity.buffer.operators.filters.reward_filter.RewardSTDFilter(threshold: float = 0.0)
Bases: ExperienceOperator
Filter experiences based on the standard deviation of rewards within each group.
Note: This filter assumes that the reward is already calculated and stored in the Experience object.
- process(exps: List[Experience]) -> Tuple[List[Experience], dict]
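Filter experiences based on the standard deviation (std) of rewards within each group. Returns the experiences that pass the filter together with an accompanying dict.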
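A group-wise filter of this kind might look like the sketch below. It is self-contained and does not use the library: `Experience` is a hypothetical stand-in, the `group_id` field is an assumed way of identifying groups, and `RewardSTDFilterSketch` is an illustrative name. Keeping a group only when its population standard deviation is strictly greater than `threshold`, and reporting a `filtered_count` key, are likewise assumptions rather than documented behaviour of `RewardSTDFilter`.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import pstdev
from typing import Dict, List, Tuple


@dataclass
class Experience:
    """Hypothetical stand-in; `group_id` is an assumed grouping key."""
    reward: float
    group_id: str


class RewardSTDFilterSketch:
    """Illustrative sketch only; not the library's implementation."""

    def __init__(self, threshold: float = 0.0):
        self.threshold = threshold

    def process(self, exps: List[Experience]) -> Tuple[List[Experience], dict]:
        # Collect experiences by group, preserving first-seen group order.
        groups: Dict[str, List[Experience]] = defaultdict(list)
        for exp in exps:
            groups[exp.group_id].append(exp)

        kept: List[Experience] = []
        for members in groups.values():
            # Population standard deviation of the rewards in this group.
            std = pstdev([exp.reward for exp in members])
            # Assumption: a group is kept only if its std exceeds the threshold.
            if std > self.threshold:
                kept.extend(members)
        return kept, {"filtered_count": len(exps) - len(kept)}


exps = [
    Experience(1.0, "a"), Experience(1.0, "a"),  # identical rewards, std = 0
    Experience(0.0, "b"), Experience(1.0, "b"),  # mixed rewards, std = 0.5
]
kept, info = RewardSTDFilterSketch(threshold=0.0).process(exps)
print([(exp.group_id, exp.reward) for exp in kept], info)
```

Under these assumptions, a group whose rewards are all identical is dropped as a whole, while a group with varied rewards is kept in full.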