data_juicer.ops.mapper.mllm_mapper module

class data_juicer.ops.mapper.mllm_mapper.MllmMapper(hf_model: str = 'llava-hf/llava-v1.6-vicuna-7b-hf', max_new_tokens=256, temperature=0.2, top_p=None, num_beams=1, *args, **kwargs)[source]

Bases: Mapper

Mapper to use MLLMs for visual question answering tasks. Recommended models:

  • llava-hf/llava-v1.6-vicuna-7b-hf

  • Qwen/Qwen2-VL-7B-Instruct

__init__(hf_model: str = 'llava-hf/llava-v1.6-vicuna-7b-hf', max_new_tokens=256, temperature=0.2, top_p=None, num_beams=1, *args, **kwargs)[source]

Initialization method.

Parameters:
  • hf_model – Hugging Face model ID.

  • max_new_tokens – the maximum number of new tokens generated by the model.
  • temperature – sampling temperature controlling the randomness of the generated text; the higher the temperature, the more random and creative the output.

  • top_p – nucleus sampling threshold: the next token is sampled only from the smallest set of tokens whose cumulative probability reaches top_p.

  • num_beams – beam search width; larger values explore more candidate sequences, generally improving the quality of the generated text at the cost of speed.

  • args – extra positional arguments.

  • kwargs – extra keyword arguments.
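To illustrate how temperature and top_p interact during decoding, here is a minimal, self-contained sketch of temperature-scaled softmax and nucleus (top-p) candidate selection. It is an illustration of the general sampling technique only, not Data-Juicer's or the model's actual implementation; the function names and toy logits are invented for this example.

```python
import math

def softmax_with_temperature(logits, temperature):
    # Dividing logits by the temperature before softmax: low temperature
    # sharpens the distribution, high temperature flattens it.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nucleus_candidates(probs, top_p):
    # Keep the smallest set of tokens (by descending probability) whose
    # cumulative probability reaches top_p; sampling is restricted to it.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return kept

logits = [2.0, 1.0, 0.5, 0.1]           # toy vocabulary of 4 tokens
cool = softmax_with_temperature(logits, 0.2)
warm = softmax_with_temperature(logits, 2.0)
print(nucleus_candidates(cool, 0.9))    # low temperature: only the top token survives
print(nucleus_candidates(warm, 0.9))    # high temperature: more tokens survive
```

With temperature 0.2 the top token absorbs nearly all probability mass, so the 0.9 nucleus contains a single candidate; at temperature 2.0 the mass spreads out and the nucleus widens, which is why higher temperature yields more varied text.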

process_single(sample=None, rank=None)[source]

Process one sample at the sample level (sample –> sample).

Parameters:

sample – sample to process

Returns:

processed sample