data_juicer.ops.mapper.detect_main_character_mapper module

class data_juicer.ops.mapper.detect_main_character_mapper.DetectMainCharacterMapper(mllm_mapper_args: Dict | None = {}, filter_min_character_num: int = 0, *args, **kwargs)

Bases: Mapper

Extract all main character names based on the given image and its caption.

__init__(mllm_mapper_args: Dict | None = {}, filter_min_character_num: int = 0, *args, **kwargs)

Initialization.

Parameters:

mllm_mapper_args – Arguments for the multimodal language model mapper, which controls the generation of captions for bounding box regions. If left as the default empty dict, fixed values are used: max_new_tokens=256, temperature=0.2, top_p=None, num_beams=1, hf_model="llava-hf/llava-v1.6-vicuna-7b-hf".
filter_min_character_num – Filters out samples where the number of main characters in the image is less than this threshold.
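
The two parameters above can be illustrated with a small standalone sketch. It does not import data_juicer or call the real operator: `DOCUMENTED_DEFAULTS` copies the fallback values listed above, while `resolve_mllm_args` and `keeps_sample` are hypothetical helpers invented for this illustration. In particular, overlaying user-supplied keys on top of the defaults is an assumption; the documentation only states what happens when the dict is empty.

```python
from typing import Dict, List, Optional

# Fallback generation settings, copied from the parameter description above.
DOCUMENTED_DEFAULTS = {
    "max_new_tokens": 256,
    "temperature": 0.2,
    "top_p": None,
    "num_beams": 1,
    "hf_model": "llava-hf/llava-v1.6-vicuna-7b-hf",
}


def resolve_mllm_args(mllm_mapper_args: Optional[Dict] = None) -> Dict:
    """Hypothetical helper, not part of data_juicer.

    An empty or missing dict yields the documented defaults. Overlaying
    user keys on top of the defaults is an assumption of this sketch.
    """
    resolved = dict(DOCUMENTED_DEFAULTS)      # copy so the defaults stay intact
    resolved.update(mllm_mapper_args or {})   # None and {} both leave defaults as-is
    return resolved


def keeps_sample(main_characters: List[str],
                 filter_min_character_num: int = 0) -> bool:
    """Hypothetical helper, not part of data_juicer.

    A sample is filtered out when it has fewer main characters than the
    threshold, so it is kept when the count is at least the threshold.
    """
    return len(main_characters) >= filter_min_character_num


if __name__ == "__main__":
    print(resolve_mllm_args({"temperature": 0.7}))
    print(keeps_sample(["Alice"], filter_min_character_num=2))
```

With the default threshold of 0, every sample passes the check, including one with no detected characters, so filtering only takes effect once filter_min_character_num is set to 1 or higher.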

process_single(samples, rank=None)

Processes data at the sample level: sample -> sample.

Parameters: sample – sample to process
Returns: processed sample