trinity
Subpackages
- trinity.algorithm
- Subpackages
- trinity.algorithm.advantage_fn
- Submodules
- trinity.algorithm.advantage_fn.advantage_fn module
- trinity.algorithm.advantage_fn.grpo_advantage module
- trinity.algorithm.advantage_fn.opmd_advantage module
- trinity.algorithm.advantage_fn.ppo_advantage module
- trinity.algorithm.advantage_fn.reinforce_plus_plus_advantage module
- trinity.algorithm.advantage_fn.remax_advantage module
- trinity.algorithm.advantage_fn.rloo_advantage module
- Module contents
- trinity.algorithm.entropy_loss_fn
- trinity.algorithm.kl_fn
- trinity.algorithm.policy_loss_fn
- Submodules
- trinity.algorithm.policy_loss_fn.dpo_loss module
- trinity.algorithm.policy_loss_fn.mix_policy_loss module
- trinity.algorithm.policy_loss_fn.opmd_policy_loss module
- trinity.algorithm.policy_loss_fn.policy_loss_fn module
- trinity.algorithm.policy_loss_fn.ppo_policy_loss module
- trinity.algorithm.policy_loss_fn.sft_loss module
- Module contents
- trinity.algorithm.sample_strategy
- Submodules
- trinity.algorithm.algorithm module
- trinity.algorithm.algorithm_manager module
- trinity.algorithm.key_mapper module
- trinity.algorithm.utils module
- Module contents
- trinity.buffer
- trinity.common
- Subpackages
- trinity.common.models
- trinity.common.rewards
- Submodules
- trinity.common.rewards.accuracy_reward module
- trinity.common.rewards.agents_reward module
- trinity.common.rewards.base module
- trinity.common.rewards.composite_reward module
- trinity.common.rewards.format_reward module
- trinity.common.rewards.human_reward module
- trinity.common.rewards.reward_fn module
- trinity.common.rewards.tool_reward module
- Module contents
- trinity.common.workflows
- Submodules
- trinity.common.config module
FormatConfig
FormatConfig.prompt_type
FormatConfig.prompt_key
FormatConfig.response_key
FormatConfig.messages_key
FormatConfig.chat_template
FormatConfig.system_prompt
FormatConfig.reply_prefix
FormatConfig.reward_fn_key
FormatConfig.workflow_key
FormatConfig.solution_key
FormatConfig.reward_key
FormatConfig.chosen_key
FormatConfig.rejected_key
FormatConfig.label_key
FormatConfig.__init__()
GenerationConfig
StorageConfig
StorageConfig.name
StorageConfig.storage_type
StorageConfig.path
StorageConfig.raw
StorageConfig.split
StorageConfig.subset_name
StorageConfig.format
StorageConfig.index
StorageConfig.wrap_in_ray
StorageConfig.capacity
StorageConfig.default_workflow_type
StorageConfig.default_reward_fn_type
StorageConfig.rollout_args
StorageConfig.workflow_args
StorageConfig.ray_namespace
StorageConfig.algorithm_type
StorageConfig.total_epochs
StorageConfig.task_type
StorageConfig.__init__()
DataPipelineConfig
DataPipelineConfig.input_buffers
DataPipelineConfig.output_buffer
DataPipelineConfig.format
DataPipelineConfig.dj_config_path
DataPipelineConfig.dj_process_desc
DataPipelineConfig.agent_model_name
DataPipelineConfig.agent_model_config
DataPipelineConfig.clean_strategy
DataPipelineConfig.min_size_ratio
DataPipelineConfig.min_priority_score
DataPipelineConfig.priority_weights
DataPipelineConfig.data_dist
DataPipelineConfig.__init__()
DataProcessorConfig
ModelConfig
InferenceModelConfig
InferenceModelConfig.model_path
InferenceModelConfig.engine_type
InferenceModelConfig.engine_num
InferenceModelConfig.tensor_parallel_size
InferenceModelConfig.use_v1
InferenceModelConfig.enforce_eager
InferenceModelConfig.enable_prefix_caching
InferenceModelConfig.enable_chunked_prefill
InferenceModelConfig.gpu_memory_utilization
InferenceModelConfig.dtype
InferenceModelConfig.seed
InferenceModelConfig.max_prompt_tokens
InferenceModelConfig.max_response_tokens
InferenceModelConfig.chat_template
InferenceModelConfig.enable_thinking
InferenceModelConfig.enable_openai_api
InferenceModelConfig.bundle_indices
InferenceModelConfig.__init__()
AlgorithmConfig
AlgorithmConfig.algorithm_type
AlgorithmConfig.repeat_times
AlgorithmConfig.sample_strategy
AlgorithmConfig.sample_strategy_args
AlgorithmConfig.advantage_fn
AlgorithmConfig.advantage_fn_args
AlgorithmConfig.kl_penalty_fn
AlgorithmConfig.kl_penalty_fn_args
AlgorithmConfig.policy_loss_fn
AlgorithmConfig.policy_loss_fn_args
AlgorithmConfig.kl_loss_fn
AlgorithmConfig.kl_loss_fn_args
AlgorithmConfig.entropy_loss_fn
AlgorithmConfig.entropy_loss_fn_args
AlgorithmConfig.use_token_level_loss
AlgorithmConfig.__init__()
ClusterConfig
ExplorerInput
TrainerInput
BufferConfig
BufferConfig.batch_size
BufferConfig.total_epochs
BufferConfig.explorer_input
BufferConfig.explorer_output
BufferConfig.trainer_input
BufferConfig.max_retry_times
BufferConfig.max_retry_interval
BufferConfig.read_batch_size
BufferConfig.tokenizer_path
BufferConfig.pad_token_id
BufferConfig.cache_dir
BufferConfig.__init__()
ExplorerConfig
TrainerConfig
MonitorConfig
SynchronizerConfig
Config
Config.mode
Config.project
Config.name
Config.checkpoint_root_dir
Config.checkpoint_job_dir
Config.ray_namespace
Config.algorithm
Config.data_processor
Config.model
Config.cluster
Config.buffer
Config.explorer
Config.trainer
Config.monitor
Config.synchronizer
Config.save()
Config.check_and_update()
Config.__init__()
load_config()
- trinity.common.constants module
- trinity.common.experience module
Experience
Experience.tokens
Experience.prompt_length
Experience.logprobs
Experience.reward
Experience.prompt_text
Experience.response_text
Experience.action_mask
Experience.chosen
Experience.rejected
Experience.info
Experience.metrics
Experience.run_id
Experience.serialize()
Experience.deserialize()
Experience.to_dict()
Experience.__init__()
Experiences
- trinity.common.schema module
RftDatasetModel
RftDatasetModel.id
RftDatasetModel.consumed_cnt
RftDatasetModel.last_modified_date
RftDatasetModel.from_id
RftDatasetModel.from_model
RftDatasetModel.from_recipe
RftDatasetModel.prompt
RftDatasetModel.response
RftDatasetModel.solution
RftDatasetModel.reward
RftDatasetModel.chosen
RftDatasetModel.rejected
RftDatasetModel.label
RftDatasetModel.quality_score
RftDatasetModel.quality_score_detail
RftDatasetModel.difficulty_score
RftDatasetModel.difficulty_score_detail
RftDatasetModel.diversity_score
RftDatasetModel.diversity_score_detail
RftDatasetModel.priority
RftDatasetModel.reward_fn
RftDatasetModel.workflow
RftDatasetModel.to_dict()
RftDatasetModel.__init__()
TaskModel
ExperienceModel
SFTDataModel
DPODataModel
- trinity.common.verl_config module
Data
ActorModel
Optim
WrapPolicy
FSDPConfig
Checkpoint
Actor
Actor.strategy
Actor.ppo_mini_batch_size
Actor.ppo_micro_batch_size
Actor.ppo_micro_batch_size_per_gpu
Actor.use_dynamic_bsz
Actor.ppo_max_token_len_per_gpu
Actor.grad_clip
Actor.ppo_epochs
Actor.shuffle
Actor.ulysses_sequence_parallel_size
Actor.checkpoint
Actor.optim
Actor.fsdp_config
Actor.loss_agg_mode
Actor.clip_ratio
Actor.entropy_coeff
Actor.use_kl_loss
Actor.kl_loss_coef
Actor.kl_loss_type
Actor.__init__()
Ref
Rollout
ActorRolloutRef
CriticModel
Critic
Critic.strategy
Critic.optim
Critic.model
Critic.ppo_mini_batch_size
Critic.ppo_micro_batch_size
Critic.ppo_micro_batch_size_per_gpu
Critic.forward_micro_batch_size
Critic.forward_micro_batch_size_per_gpu
Critic.use_dynamic_bsz
Critic.ppo_max_token_len_per_gpu
Critic.forward_max_token_len_per_gpu
Critic.ulysses_sequence_parallel_size
Critic.ppo_epochs
Critic.shuffle
Critic.grad_clip
Critic.cliprange_value
Critic.checkpoint
Critic.rollout_n
Critic.loss_agg_mode
Critic.__init__()
RewardModel
CustomRewardFunction
KL_Ctrl
Algorithm
Trainer
Trainer.balance_batch
Trainer.total_epochs
Trainer.total_training_steps
Trainer.project_name
Trainer.experiment_name
Trainer.logger
Trainer.val_generations_to_log_to_wandb
Trainer.nnodes
Trainer.n_gpus_per_node
Trainer.save_freq
Trainer.resume_mode
Trainer.resume_from_path
Trainer.test_freq
Trainer.critic_warmup
Trainer.default_hdfs_dir
Trainer.remove_previous_ckpt_in_save
Trainer.del_local_ckpt_after_load
Trainer.default_local_dir
Trainer.val_before_train
Trainer.training_rollout_mode
Trainer.enable_exp_buffer
Trainer.sync_freq
Trainer.sft_warmup_steps
Trainer.max_actor_ckpt_to_keep
Trainer.max_critic_ckpt_to_keep
Trainer.__init__()
veRLConfig
load_config()
- Module contents
- trinity.explorer
- Submodules
- trinity.explorer.explorer module
Explorer
Explorer.__init__()
Explorer.setup_weight_sync_group()
Explorer.prepare()
Explorer.get_weight()
Explorer.explore()
Explorer.explore_step()
Explorer.need_sync()
Explorer.eval()
Explorer.benchmark()
Explorer.wait_for_workflow_done()
Explorer.sync_weight()
Explorer.running_status()
Explorer.flush_log()
Explorer.shutdown()
- trinity.explorer.runner_pool module
- trinity.explorer.workflow_runner module
- Module contents
Explorer
Explorer.__init__()
Explorer.setup_weight_sync_group()
Explorer.prepare()
Explorer.get_weight()
Explorer.explore()
Explorer.explore_step()
Explorer.need_sync()
Explorer.eval()
Explorer.benchmark()
Explorer.wait_for_workflow_done()
Explorer.sync_weight()
Explorer.running_status()
Explorer.flush_log()
Explorer.shutdown()
RunnerPool
- trinity.manager
- Subpackages
- trinity.manager.config_registry
- Submodules
- trinity.manager.config_registry.algorithm_config_manager module
- trinity.manager.config_registry.buffer_config_manager module
- trinity.manager.config_registry.config_registry module
- trinity.manager.config_registry.explorer_config_manager module
- trinity.manager.config_registry.model_config_manager module
- trinity.manager.config_registry.trainer_config_manager module
- Module contents
- Submodules
- trinity.manager.config_manager module
- trinity.manager.manager module
- Module contents
- trinity.plugins
- trinity.trainer
- Subpackages
- Submodules
- trinity.trainer.trainer module
- trinity.trainer.verl_trainer module
VerlPPOTrainerWrapper
VerlPPOTrainerWrapper.__init__()
VerlPPOTrainerWrapper.init_workers()
VerlPPOTrainerWrapper.reset_experiences_example_table()
VerlPPOTrainerWrapper.train_step_num
VerlPPOTrainerWrapper.prepare()
VerlPPOTrainerWrapper.train_step()
VerlPPOTrainerWrapper.save_checkpoint()
VerlPPOTrainerWrapper.sync_weight()
VerlPPOTrainerWrapper.sft_to_rft()
VerlPPOTrainerWrapper.shutdown()
- Module contents
- trinity.utils
- Submodules
- trinity.utils.distributed module
- trinity.utils.dlc_utils module
- trinity.utils.eval_utils module
- trinity.utils.log module
- trinity.utils.monitor module
- trinity.utils.plugin_loader module
- trinity.utils.registry module
- trinity.utils.timer module
- Module contents
Module contents
Trinity-RFT (Reinforcement Fine-Tuning)