trinity.common.models.mm_utils module
Multi-modal utilities for processing and handling multi-modal data such as images and videos. Only the Qwen2.5-VL series is supported.
Modified from: verl/utils/dataset/rl_dataset.py