data_juicer.ops.mapper.video_object_segmenting_mapper module

class data_juicer.ops.mapper.video_object_segmenting_mapper.VideoObjectSegmentingMapper(sam2_hf_model: str = 'facebook/sam2.1-hiera-tiny', yoloe_path: str = 'yoloe-11l-seg.pt', yoloe_conf: float = 0.5, torch_dtype: str = 'bf16', if_binarize: bool = True, if_save_visualization: bool = False, save_visualization_dir: str = '/home/runner/.cache/data_juicer/assets', *args, **kwargs)

Bases: Mapper

Text-guided semantic segmentation of valid objects throughout the video (YOLOE + SAM2).

__init__(sam2_hf_model: str = 'facebook/sam2.1-hiera-tiny', yoloe_path: str = 'yoloe-11l-seg.pt', yoloe_conf: float = 0.5, torch_dtype: str = 'bf16', if_binarize: bool = True, if_save_visualization: bool = False, save_visualization_dir: str = '/home/runner/.cache/data_juicer/assets', *args, **kwargs)

Initialization method.

Parameters:
  • sam2_hf_model – Hugging Face model ID of SAM2.

  • yoloe_path – The path to the YOLOE model.

  • yoloe_conf – Confidence threshold for YOLOE object detection.

  • torch_dtype – The floating point type used for model inference. Can be one of [‘fp32’, ‘fp16’, ‘bf16’].

  • if_binarize – Whether the final mask requires binarization. If ‘if_save_visualization’ is set to True, ‘if_binarize’ will automatically be adjusted to True.

  • if_save_visualization – Whether to save visualization results.

  • save_visualization_dir – The path for saving visualization results.
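
The parameter rules above can be sketched as a small, standalone helper. This is only an illustration, not part of Data-Juicer: the function name `build_segmenting_kwargs` is hypothetical, and the check that `yoloe_conf` lies in [0, 1] is an assumption about confidence thresholds rather than something the documentation states. The helper does not import Data-Juicer or load any model; it only collects keyword arguments and applies the documented `torch_dtype` choices and the `if_binarize` override.

```python
ALLOWED_DTYPES = ("fp32", "fp16", "bf16")  # values listed for torch_dtype


def build_segmenting_kwargs(
    sam2_hf_model="facebook/sam2.1-hiera-tiny",
    yoloe_path="yoloe-11l-seg.pt",
    yoloe_conf=0.5,
    torch_dtype="bf16",
    if_binarize=True,
    if_save_visualization=False,
    save_visualization_dir=None,
):
    """Hypothetical helper: collect and sanity-check constructor arguments."""
    if torch_dtype not in ALLOWED_DTYPES:
        raise ValueError(f"torch_dtype must be one of {ALLOWED_DTYPES}")
    # Assumption: a confidence threshold is a probability-like value.
    if not 0.0 <= yoloe_conf <= 1.0:
        raise ValueError("yoloe_conf is expected to lie in [0, 1]")

    kwargs = {
        "sam2_hf_model": sam2_hf_model,
        "yoloe_path": yoloe_path,
        "yoloe_conf": yoloe_conf,
        "torch_dtype": torch_dtype,
        "if_binarize": if_binarize,
        "if_save_visualization": if_save_visualization,
    }
    # Documented rule: saving visualizations forces binarized masks.
    if if_save_visualization:
        kwargs["if_binarize"] = True
    # Only override the default directory when the caller supplies one.
    if save_visualization_dir is not None:
        kwargs["save_visualization_dir"] = save_visualization_dir
    return kwargs


kwargs = build_segmenting_kwargs(
    torch_dtype="fp16",
    if_binarize=False,
    if_save_visualization=True,
    save_visualization_dir="./vis",
)
print(kwargs["if_binarize"])  # True
```

Because the keys match the constructor signature, the resulting dictionary could be passed on as `VideoObjectSegmentingMapper(**kwargs)` in an environment where Data-Juicer and the model weights are available.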

process_single(sample=None, rank=None)

Sample-level processing: takes one sample and returns the processed sample (sample -> sample).

Parameters:

sample – sample to process

Returns:

processed sample
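
The sample -> sample contract can be illustrated with a stand-in that involves no models. Everything in this sketch is hypothetical: the function `process_single_stub`, the `videos` input key, and the `stub_segments` output key are assumptions made for illustration and do not describe the fields the real operator reads or writes.

```python
def process_single_stub(sample=None, rank=None):
    """Hypothetical stand-in mirroring the process_single signature."""
    # Assumption: video paths are stored under a "videos" key.
    video_paths = sample.get("videos", [])
    # Return a new dict so the caller's sample is left untouched.
    processed = dict(sample)
    # Hypothetical output key: one placeholder entry per input video.
    processed["stub_segments"] = [
        {"video": path, "masks": []} for path in video_paths
    ]
    return processed


sample = {"videos": ["clip_a.mp4", "clip_b.mp4"], "text": "a dog running"}
result = process_single_stub(sample)
print(len(result["stub_segments"]))  # 2
```

The `rank` argument is accepted only to mirror the documented signature and is unused here.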