data_juicer.ops.mapper.video_extract_frames_mapper module
- class data_juicer.ops.mapper.video_extract_frames_mapper.VideoExtractFramesMapper(frame_sampling_method: str = 'all_keyframes', frame_num: Annotated[int, Gt(gt=0)] = 3, duration: float = 0, frame_dir: str = None, frame_key='video_frames', *args, **kwargs)
Bases: Mapper
Mapper to extract frames from video files according to specified methods.
Extracted Frames Data Format:
The extracted frames are described by a dictionary that maps each video key to the directory where that video's extracted frames are saved. The dictionary follows the structure:
{
    "video_key_1": "/${frame_dir}/video_key_1_filename/",
    "video_key_2": "/${frame_dir}/video_key_2_filename/",
    ...
}
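As an illustration of the structure above, the following is a minimal sketch of how such a mapping could be assembled. The helper name `build_frames_mapping` is hypothetical and not part of data_juicer, and deriving each directory name from the video file name without its extension is an assumption; the actual naming used by the mapper may differ.

```python
import posixpath


def build_frames_mapping(video_keys, frame_dir):
    """Map each video key to a per-video frames directory (illustrative only).

    Assumption: the directory name is the video file name without its
    extension. The real mapper may derive the name differently.
    """
    mapping = {}
    for video_key in video_keys:
        # Strip the parent directories and the extension from the video key.
        stem = posixpath.splitext(posixpath.basename(video_key))[0]
        # Trailing slash mirrors the structure shown in the docstring.
        mapping[video_key] = posixpath.join(frame_dir, stem) + "/"
    return mapping


frames_info = build_frames_mapping(["/data/a.mp4", "/data/b.mp4"], "/out/frames")
```

In a processed sample, a dictionary of this shape would be stored under the field named by frame_key.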
- __init__(frame_sampling_method: str = 'all_keyframes', frame_num: Annotated[int, Gt(gt=0)] = 3, duration: float = 0, frame_dir: str = None, frame_key='video_frames', *args, **kwargs)
Initialization method.
- Parameters:
frame_sampling_method – The sampling method used to extract frames from the videos. Should be one of [“all_keyframes”, “uniform”]. “all_keyframes” extracts all key frames (the number of which depends on the duration of the video), and “uniform” extracts a specified number of frames uniformly from the video. If “duration” > 0, frame_sampling_method is applied to every segment. Default: “all_keyframes”.
frame_num – the number of frames to be extracted uniformly from the video. Only works when frame_sampling_method is “uniform”. If it’s 1, only the middle frame will be extracted. If it’s 2, only the first and the last frames will be extracted. If it’s larger than 2, in addition to the first and the last frames, other frames will be extracted uniformly within the video duration. If “duration” > 0, frame_num is the number of frames per segment.
duration – The duration of each segment in seconds. If 0, frames are extracted from the entire video. If duration > 0, the video is segmented into multiple segments based on duration, and frames are extracted from each segment.
frame_dir – Output directory to save extracted frames. If None, a default directory based on the video file path is used.
frame_key – The name of the field in which to save the generated frames info.
args – extra positional args
kwargs – extra keyword args
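The frame_num and duration rules described above can be sketched in a few lines. This is an illustrative approximation working on timestamps in seconds, not the mapper's actual implementation; the function names `segment_bounds` and `uniform_timestamps` are hypothetical, and the handling of a shorter final segment is an assumption.

```python
def segment_bounds(total_seconds, duration):
    """Split [0, total_seconds] into segments of `duration` seconds.

    duration == 0 means the entire video is treated as one segment.
    Assumption: a shorter final segment is kept rather than dropped.
    """
    if duration <= 0:
        return [(0.0, total_seconds)]
    bounds = []
    start = 0.0
    while start < total_seconds:
        bounds.append((start, min(start + duration, total_seconds)))
        start += duration
    return bounds


def uniform_timestamps(start, end, frame_num):
    """Pick `frame_num` timestamps uniformly within [start, end].

    1 frame  -> the middle of the range
    2 frames -> the first and last positions
    >2       -> first, last, and evenly spaced positions in between
    """
    if frame_num <= 0:
        raise ValueError("frame_num must be greater than 0")
    if frame_num == 1:
        return [(start + end) / 2]
    step = (end - start) / (frame_num - 1)
    return [start + i * step for i in range(frame_num)]


# With duration > 0, frame_num applies to each segment separately.
per_segment = [
    uniform_timestamps(seg_start, seg_end, 3)
    for seg_start, seg_end in segment_bounds(10.0, 4.0)
]
```

For a 10-second video with duration=4 and frame_num=3, this sketch yields three segments with three timestamps each, so nine frames in total rather than three.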