data_juicer.ops.mapper.video_remove_watermark_mapper module¶
- class data_juicer.ops.mapper.video_remove_watermark_mapper.VideoRemoveWatermarkMapper(roi_strings: List[str] = ['0,0,0.1,0.1'], roi_type: str = 'ratio', roi_key: str | None = None, frame_num: Annotated[int, Gt(gt=0)] = 10, min_frame_threshold: Annotated[int, Gt(gt=0)] = 7, detection_method: str = 'pixel_value', save_dir: str = None, *args, **kwargs)[source]¶
Bases:
Mapper
Removes watermarks from the given regions of videos.
- __init__(roi_strings: List[str] = ['0,0,0.1,0.1'], roi_type: str = 'ratio', roi_key: str | None = None, frame_num: Annotated[int, Gt(gt=0)] = 10, min_frame_threshold: Annotated[int, Gt(gt=0)] = 7, detection_method: str = 'pixel_value', save_dir: str = None, *args, **kwargs)[source]¶
Initialization method.
- Parameters:
roi_strings -- a list of regions in which the watermarks are located. Each region can be written as "x1, y1, x2, y2", "(x1, y1, x2, y2)", or "[x1, y1, x2, y2]".
roi_type -- the type of the ROI strings. If it is 'pixel', (x1, y1) and (x2, y2) are the pixel locations of the top-left and bottom-right corners respectively. If it is 'ratio', the coordinates are normalized by the width and height of the video.
roi_key -- the key name of the field in each sample that stores the roi_strings for that sample. It is used to set different ROIs for different samples. If it is None, the ROIs in the "roi_strings" parameter are used. It is None by default.
frame_num -- the number of frames to extract uniformly from the video in order to detect the watermark pixels.
min_frame_threshold -- a coordinate is considered the location of a watermark pixel when it is detected as one in no fewer than min_frame_threshold frames.
detection_method -- the method used to detect the watermark pixels. If it is 'pixel_value', the distribution of pixel values in each frame is considered. If it is 'pixel_diversity', the diversity of each pixel across different frames is considered. In 'pixel_diversity' mode, min_frame_threshold is ignored and frame_num must be greater than 1.
save_dir -- The directory where generated video files will be stored. If not specified, outputs will be saved in the same directory as their corresponding input files. This path can alternatively be defined by setting the DJ_PRODUCED_DATA_DIR environment variable.
args -- extra positional arguments
kwargs -- extra keyword arguments
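To make the roi_strings, roi_type, and min_frame_threshold parameters above more concrete, here is a minimal sketch in plain Python. It is not the operator's actual implementation: the helper names parse_roi, to_pixel_roi, and vote_watermark_mask are hypothetical, rounding ratio coordinates to the nearest pixel is an assumption, and the per-frame binary masks are assumed to come from some detection step that this page does not describe.

```python
from typing import List, Tuple

Roi = Tuple[float, float, float, float]


def parse_roi(roi_string: str) -> Roi:
    """Parse 'x1, y1, x2, y2', '(x1, y1, x2, y2)' or '[x1, y1, x2, y2]'."""
    # Drop surrounding whitespace and an optional pair of brackets.
    stripped = roi_string.strip().strip("()[]")
    parts = [p.strip() for p in stripped.split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 coordinates, got {len(parts)}: {roi_string!r}")
    x1, y1, x2, y2 = (float(p) for p in parts)
    return x1, y1, x2, y2


def to_pixel_roi(roi: Roi, roi_type: str, width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert an ROI to integer pixel coordinates (rounding is an assumption)."""
    x1, y1, x2, y2 = roi
    if roi_type == "ratio":
        # Ratio coordinates are normalized by the frame width and height.
        return (round(x1 * width), round(y1 * height),
                round(x2 * width), round(y2 * height))
    if roi_type == "pixel":
        return (round(x1), round(y1), round(x2), round(y2))
    raise ValueError(f"unknown roi_type: {roi_type!r}")


def vote_watermark_mask(frame_masks: List[List[List[int]]],
                        min_frame_threshold: int) -> List[List[bool]]:
    """Keep a pixel only if at least min_frame_threshold frames flagged it."""
    rows, cols = len(frame_masks[0]), len(frame_masks[0][0])
    return [
        [sum(mask[r][c] for mask in frame_masks) >= min_frame_threshold
         for c in range(cols)]
        for r in range(rows)
    ]


# The default ROI string on a hypothetical 1920x1080 video.
default_roi = to_pixel_roi(parse_roi("0,0,0.1,0.1"), "ratio", 1920, 1080)

# Three hypothetical 1x2 per-frame masks; only the first pixel is flagged in
# at least two frames.
voted = vote_watermark_mask([[[1, 0]], [[1, 1]], [[1, 0]]], min_frame_threshold=2)
```

With the defaults frame_num=10 and min_frame_threshold=7, the same vote would require a pixel to be flagged in at least 7 of the 10 sampled frames.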