data_juicer.ops.filter.video_aesthetics_filter module
- class data_juicer.ops.filter.video_aesthetics_filter.VideoAestheticsFilter(hf_scorer_model: str = '', trust_remote_code: bool = False, min_score: float = 0.4, max_score: float = 1.0, frame_sampling_method: str = 'uniform', frame_num: Annotated[int, Gt(gt=0)] = 3, any_or_all: str = 'any', reduce_mode: str = 'avg', *args, **kwargs)
Bases: Filter
Filter that keeps data samples whose videos have aesthetics scores, predicted on the sampled frames, within a specified range.
- __init__(hf_scorer_model: str = '', trust_remote_code: bool = False, min_score: float = 0.4, max_score: float = 1.0, frame_sampling_method: str = 'uniform', frame_num: Annotated[int, Gt(gt=0)] = 3, any_or_all: str = 'any', reduce_mode: str = 'avg', *args, **kwargs)
Initialization method.
- Parameters:
hf_scorer_model – Hugging Face model name for the aesthetics predictor. By default, ‘shunk031/aesthetics-predictor-v2-sac-logos-ava1-l14-linearMSE’ is used; see pypi.org/project/simple-aesthetics-predictor.
min_score – Minimum predicted aesthetics score for a video. Default: 0.4.
max_score – Maximum predicted aesthetics score for a video. Default: 1.0.
frame_sampling_method – Method used to sample frame images from the videos. Must be one of [“all_keyframes”, “uniform”]. “all_keyframes” extracts all key frames; “uniform” extracts a specified number of frames uniformly from the video. Default: “uniform” with frame_num=3, because the number of key frames can be large while their aesthetics usually differ little.
frame_num – Number of frames to extract uniformly from the video. Only takes effect when frame_sampling_method is “uniform”. If it is 1, only the middle frame is extracted. If it is 2, only the first and last frames are extracted. If it is larger than 2, the first and last frames are extracted along with additional frames spaced uniformly over the video duration.
any_or_all – Strategy for deciding whether to keep a sample across all of its videos. ‘any’: keep the sample if any video meets the condition. ‘all’: keep the sample only if all videos meet the condition.
reduce_mode – How to reduce the scores when one video corresponds to multiple frames. Must be one of [‘avg’, ‘max’, ‘min’]. ‘avg’: take the average of the values. ‘max’: take the maximum of the values. ‘min’: take the minimum of the values.
args – Extra positional arguments.
kwargs – Extra keyword arguments.
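The frame_num rules described above can be illustrated with a minimal sketch. The helper name uniform_frame_positions and the use of timestamps in seconds are assumptions made for illustration; this is not the library's own sampling code, which may select frames differently in detail.

```python
def uniform_frame_positions(duration: float, frame_num: int) -> list[float]:
    """Return timestamps (in seconds) to sample from a video of the given duration.

    Hypothetical helper that mirrors the documented frame_num behaviour:
    1 -> middle frame only; 2 -> first and last frames;
    >2 -> first, last, and evenly spaced frames in between.
    """
    if frame_num <= 0:
        raise ValueError("frame_num must be greater than 0")
    if frame_num == 1:
        # Only the middle of the video is sampled.
        return [duration / 2]
    # Endpoints are always included; the rest are spaced evenly between them.
    step = duration / (frame_num - 1)
    return [i * step for i in range(frame_num)]
```

For a 10-second video, frame_num=3 would give positions at 0, 5, and 10 seconds under these assumptions.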
- compute_stats_single(sample, rank=None, context=False)
Compute stats for the sample, which are used as metrics to decide whether to filter out this sample.
- Parameters:
sample – input sample.
context – whether to store context information of intermediate vars in the sample temporarily.
- Returns:
sample with computed stats
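As a rough sketch of how per-frame scores might be turned into a keep-or-drop decision using reduce_mode, min_score, max_score, and any_or_all: the function keep_sample and its input layout (one list of frame scores per video) are hypothetical, and the inclusive bounds and the handling of samples without videos are assumptions rather than confirmed library behaviour.

```python
from statistics import mean

# Map each documented reduce_mode to a plain Python reducer.
REDUCERS = {"avg": mean, "max": max, "min": min}


def keep_sample(frame_scores_per_video, min_score=0.4, max_score=1.0,
                any_or_all="any", reduce_mode="avg"):
    """Decide whether a sample is kept, given per-frame scores for each video.

    Hypothetical helper: frame_scores_per_video is a list with one list of
    frame scores per video in the sample.
    """
    if reduce_mode not in REDUCERS:
        raise ValueError("reduce_mode must be one of 'avg', 'max', 'min'")
    if any_or_all not in ("any", "all"):
        raise ValueError("any_or_all must be 'any' or 'all'")
    if not frame_scores_per_video:
        # Assumption: a sample without videos is kept.
        return True
    reduce = REDUCERS[reduce_mode]
    # One boolean per video; bounds are assumed to be inclusive.
    in_range = [min_score <= reduce(scores) <= max_score
                for scores in frame_scores_per_video]
    return any(in_range) if any_or_all == "any" else all(in_range)
```

With two videos scored [0.2, 0.3, 0.4] and [0.5, 0.6, 0.7], the ‘avg’ mode would keep the sample under ‘any’ (the second video is in range) but drop it under ‘all’ (the first video averages below 0.4).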