data_juicer.ops.mapper.image_segment_mapper module

class data_juicer.ops.mapper.image_segment_mapper.ImageSegmentMapper(imgsz=1024, conf=0.05, iou=0.5, model_path='FastSAM-x.pt', *args, **kwargs)[source]

Bases: Mapper

Segment objects in images with FastSAM and return their bounding boxes.

This operator uses a FastSAM model to detect and segment objects in each image of a sample, and stores the resulting bounding boxes in the ‘bbox_tag’ field under the ‘meta’ key. If the sample contains no images, an empty array is stored instead. The image resolution, the confidence threshold, and the IoU (Intersection over Union) threshold used during segmentation are configurable. Bounding boxes are represented as an N x M x 4 array, where N is the number of images, M is the number of boxes detected in an image, and 4 is the number of coordinates per box.
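The N x M x 4 layout described above can be illustrated with a small, library-free sketch. The helper `boxes_per_image` and the literal keys `"meta"` and `"bbox_tag"` are assumptions made for illustration (the actual key constants live inside Data-Juicer), and the box values are made up. Nested lists stand in for the stored array so that each image can carry a different number of boxes.

```python
def boxes_per_image(sample, meta_key="meta", tag_key="bbox_tag"):
    """Return the number of boxes stored for each image (hypothetical helper)."""
    counts = []
    for boxes in sample[meta_key][tag_key]:  # one entry per image (N)
        for box in boxes:                    # one entry per detected box (M)
            if len(box) != 4:                # each box holds 4 coordinates
                raise ValueError(f"expected 4 coordinates, got {len(box)}")
        counts.append(len(boxes))
    return counts


# Made-up processed sample: two images, with two boxes and one box respectively.
sample = {
    "images": ["a.jpg", "b.jpg"],
    "meta": {
        "bbox_tag": [
            [[10.0, 20.0, 30.0, 40.0], [5.0, 5.0, 8.0, 8.0]],
            [[0.0, 0.0, 64.0, 64.0]],
        ]
    },
}

print(boxes_per_image(sample))  # [2, 1]
```

A sample without images would carry an empty container here, for which the helper returns an empty list.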

__init__(imgsz=1024, conf=0.05, iou=0.5, model_path='FastSAM-x.pt', *args, **kwargs)[source]

Initialization method.

Parameters:
  • imgsz – resolution for image resizing

  • conf – confidence score threshold

  • iou – IoU (Intersection over Union) score threshold

  • model_path – path to the FastSAM model weights. The model name must be one of [‘FastSAM-x.pt’, ‘FastSAM-s.pt’].
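To make the iou parameter more concrete, the following sketch computes Intersection over Union for two axis-aligned boxes. It assumes boxes given as corner coordinates (x1, y1, x2, y2), which is not necessarily the coordinate format this operator stores, and the function name `box_iou` is hypothetical. How the operator applies the threshold internally is not stated here; in detection pipelines, an IoU threshold is commonly used to decide when two boxes overlap enough to be treated as duplicates.

```python
def box_iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) corner format (assumed format)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the part counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1) = 1/7.
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

With the default iou=0.5, this pair of boxes (IoU of 1/7) would fall well below the threshold.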

process_single(sample, rank=None, context=False)[source]

Process a single sample and return it (sample -> sample).

Parameters:

sample – sample to process

Returns:

processed sample
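A minimal usage sketch for calling the operator on one sample follows. Several details are assumptions rather than documented behavior: that the operator exposes an `image_key` attribute naming the field with image paths, that the meta key constant is available as `Fields.meta` in `data_juicer.utils.constant`, and that the sample must already contain an (empty) meta dictionary. The helpers `make_sample` and `segment_sample` are hypothetical. Running `segment_sample` requires Data-Juicer to be installed and the FastSAM weights to be available.

```python
def make_sample(image_paths, image_key="images", meta_key="meta"):
    """Build a bare sample dict holding image paths and an empty meta dict."""
    return {image_key: list(image_paths), meta_key: {}}


def segment_sample(image_paths, **op_kwargs):
    """Run ImageSegmentMapper on one sample (sketch under the stated assumptions)."""
    # Imported lazily so this module can be loaded without Data-Juicer installed.
    from data_juicer.ops.mapper.image_segment_mapper import ImageSegmentMapper
    from data_juicer.utils.constant import Fields  # assumed location of the constant

    op = ImageSegmentMapper(**op_kwargs)  # e.g. imgsz=1024, conf=0.05, iou=0.5
    sample = make_sample(
        image_paths,
        image_key=op.image_key,  # assumed attribute on the operator
        meta_key=Fields.meta,
    )
    return op.process_single(sample)


# Building the input sample does not require Data-Juicer:
print(make_sample(["a.jpg"]))  # {'images': ['a.jpg'], 'meta': {}}
```

In a regular pipeline the operator is applied to a whole dataset rather than called by hand; the direct call is mainly useful for quick inspection of the output.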