trinity.algorithm.sample_strategy
Submodules
trinity.algorithm.sample_strategy.mix_sample_strategy module
- class trinity.algorithm.sample_strategy.mix_sample_strategy.MixSampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
SampleStrategy
The default sample strategy.
- __init__(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
- trinity.algorithm.sample_strategy.mix_sample_strategy.to_data_proto_mix(experiences: Experiences, is_expert_mask: tensor) → DataProto [source]
trinity.algorithm.sample_strategy.sample_strategy module
- class trinity.algorithm.sample_strategy.sample_strategy.SampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
ABC
- __init__(buffer_config: BufferConfig, trainer_type: str, **kwargs) → None [source]
- abstract sample(step: int) → Tuple[Any, Dict, List] [source]
Sample data from the buffer.
- Parameters:
step (int) – The current step number.
- Returns:
Any: The sampled data. Dict: Metrics for logging. List: Representative data for logging.
- Return type:
Tuple[Any, Dict, List]
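As an illustration of the contract described above, the following minimal sketch implements the abstract sample method in a subclass. It does not import Trinity: the SampleStrategy class here is a local stand-in that only mirrors the documented signature, and FixedBatchStrategy, the dict-shaped buffer_config, and the metric name are all hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple


class SampleStrategy(ABC):
    """Local stand-in mirroring the documented interface; NOT the real Trinity class."""

    def __init__(self, buffer_config: Any, trainer_type: str, **kwargs) -> None:
        self.buffer_config = buffer_config
        self.trainer_type = trainer_type

    @abstractmethod
    def sample(self, step: int) -> Tuple[Any, Dict, List]:
        """Sample data from the buffer."""


class FixedBatchStrategy(SampleStrategy):
    """Hypothetical strategy that slices a fixed-size batch from an in-memory list."""

    def __init__(self, buffer_config: Any, trainer_type: str, **kwargs) -> None:
        super().__init__(buffer_config, trainer_type, **kwargs)
        # Assumption: buffer_config is a plain dict here, not a real BufferConfig.
        self.items = list(buffer_config["items"])
        self.batch_size = int(buffer_config["batch_size"])

    def sample(self, step: int, **kwargs) -> Tuple[Any, Dict, List]:
        # Pick a window of items whose start depends on the step number.
        start = (step * self.batch_size) % len(self.items)
        batch = self.items[start:start + self.batch_size]
        # Second element: metrics for logging.
        metrics = {"sample/batch_size": len(batch)}
        # Third element: a small representative subset for logging.
        representative = [{"item": item} for item in batch[:1]]
        return batch, metrics, representative


strategy = FixedBatchStrategy({"items": range(10), "batch_size": 4}, trainer_type="demo")
batch, metrics, representative = strategy.sample(step=1)
```

Because sample is declared abstract, instantiating the base class directly raises TypeError; only subclasses that override sample can be constructed.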
- class trinity.algorithm.sample_strategy.sample_strategy.WarmupSampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
SampleStrategy
The default sample strategy.
- __init__(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
- sample(step: int, **kwargs) → Tuple[Any, Dict, List] [source]
Sample data from the buffer.
- Parameters:
step (int) – The current step number.
- Returns:
Any: The sampled data. Dict: Metrics for logging. List: Representative data for logging.
- Return type:
Tuple[Any, Dict, List]
- class trinity.algorithm.sample_strategy.sample_strategy.DefaultSampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
SampleStrategy
- __init__(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
- sample(step: int, **kwargs) → Tuple[Any, Dict, List] [source]
Sample data from the buffer.
- Parameters:
step (int) – The current step number.
- Returns:
Any: The sampled data. Dict: Metrics for logging. List: Representative data for logging.
- Return type:
Tuple[Any, Dict, List]
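The three-element return value can be unpacked by the caller once per step. The sketch below shows one way a loop might consume it. CountingStrategy and run_steps are hypothetical names, the strategy is a duck-typed stand-in rather than the real DefaultSampleStrategy, and the way the actual trainer calls sample is not specified here.

```python
from typing import Any, Dict, List, Tuple


class CountingStrategy:
    """Hypothetical stand-in exposing the documented sample(step, **kwargs) shape."""

    def sample(self, step: int, **kwargs) -> Tuple[Any, Dict, List]:
        data = [step, step + 1]
        metrics = {"num_samples": len(data)}
        representative = [{"step": step, "first": data[0]}]
        return data, metrics, representative


def run_steps(strategy, num_steps: int) -> List[Dict]:
    """Call sample() once per step and collect one log record per step."""
    history = []
    for step in range(num_steps):
        data, metrics, representative = strategy.sample(step)
        # A real trainer would consume `data` here; this sketch only records metrics.
        history.append({"step": step, **metrics, "examples": len(representative)})
    return history


history = run_steps(CountingStrategy(), num_steps=3)
```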
- class trinity.algorithm.sample_strategy.sample_strategy.DPOSampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
WarmupSampleStrategy
trinity.algorithm.sample_strategy.utils module
- trinity.algorithm.sample_strategy.utils.to_data_proto(experiences: Experiences) → DataProto [source]
- trinity.algorithm.sample_strategy.utils.representative_sample(experiences: List[Experience]) → List[dict] [source]
Module contents
- class trinity.algorithm.sample_strategy.SampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
ABC
- __init__(buffer_config: BufferConfig, trainer_type: str, **kwargs) → None [source]
- abstract sample(step: int) → Tuple[Any, Dict, List] [source]
Sample data from the buffer.
- Parameters:
step (int) – The current step number.
- Returns:
Any: The sampled data. Dict: Metrics for logging. List: Representative data for logging.
- Return type:
Tuple[Any, Dict, List]
- class trinity.algorithm.sample_strategy.DefaultSampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
SampleStrategy
- __init__(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
- sample(step: int, **kwargs) → Tuple[Any, Dict, List] [source]
Sample data from the buffer.
- Parameters:
step (int) – The current step number.
- Returns:
Any: The sampled data. Dict: Metrics for logging. List: Representative data for logging.
- Return type:
Tuple[Any, Dict, List]
- class trinity.algorithm.sample_strategy.WarmupSampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
SampleStrategy
The default sample strategy.
- __init__(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
- sample(step: int, **kwargs) → Tuple[Any, Dict, List] [source]
Sample data from the buffer.
- Parameters:
step (int) – The current step number.
- Returns:
Any: The sampled data. Dict: Metrics for logging. List: Representative data for logging.
- Return type:
Tuple[Any, Dict, List]
- class trinity.algorithm.sample_strategy.MixSampleStrategy(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]
Bases:
SampleStrategy
The default sample strategy.
- __init__(buffer_config: BufferConfig, trainer_type: str, **kwargs)[source]