data_juicer.ops.mapper.video_object_segmenting_mapper module

class data_juicer.ops.mapper.video_object_segmenting_mapper.VideoObjectSegmentingMapper(*args, **kwargs)[source]

Bases: Mapper

Text-guided semantic segmentation of valid objects throughout the video (YOLOE + SAM2).

__init__(sam2_hf_model: str = 'facebook/sam2.1-hiera-tiny', yoloe_path: str = 'yoloe-11l-seg.pt', yoloe_conf: float = 0.5, torch_dtype: str = 'bf16', if_binarize: bool = True, if_save_visualization: bool = False, save_visualization_dir: str = '/home/runner/.cache/data_juicer/assets', *args, **kwargs)[source]

Initialization method.

Parameters:
  • sam2_hf_model -- Hugging Face model id of SAM2.

  • yoloe_path -- The path to the YOLOE model.

  • yoloe_conf -- Confidence threshold for YOLOE object detection.

  • torch_dtype -- The floating-point type used for model inference. Must be one of ['fp32', 'fp16', 'bf16'].

  • if_binarize -- Whether to binarize the final mask. If 'if_save_visualization' is set to True, 'if_binarize' is automatically set to True.

  • if_save_visualization -- Whether to save visualization results.

  • save_visualization_dir -- The path for saving visualization results.
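
The interaction between these options can be sketched as follows. This is a minimal illustration under stated assumptions, not the operator's actual implementation: `resolve_options` and `VALID_DTYPES` are hypothetical names introduced here, and the only behaviour modelled is what the parameter descriptions above state (the three accepted `torch_dtype` strings, and `if_binarize` being forced to True when `if_save_visualization` is True). Raising an error on an unknown dtype is an assumption.

```python
# Hypothetical helper: mirrors the documented option rules only.
VALID_DTYPES = ("fp32", "fp16", "bf16")


def resolve_options(torch_dtype="bf16", if_binarize=True,
                    if_save_visualization=False):
    """Return the effective options as a dict of keyword arguments."""
    # The docs list exactly three accepted dtype strings; rejecting
    # anything else is an assumption made for this sketch.
    if torch_dtype not in VALID_DTYPES:
        raise ValueError(
            f"torch_dtype must be one of {VALID_DTYPES}, got {torch_dtype!r}"
        )
    # Documented rule: saving visualizations implies binarized masks.
    if if_save_visualization:
        if_binarize = True
    return {
        "torch_dtype": torch_dtype,
        "if_binarize": if_binarize,
        "if_save_visualization": if_save_visualization,
    }


# if_binarize=False is overridden because visualization is requested.
opts = resolve_options("fp16", if_binarize=False, if_save_visualization=True)
print(opts)
```

The keys of the returned dict match the keyword names in the `__init__` signature, so a dict of this shape could in principle be unpacked into the constructor alongside `sam2_hf_model`, `yoloe_path`, and the other arguments.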

process_single(sample=None, rank=None)[source]

Sample-level processing: takes a single sample and returns the processed sample (sample --> sample).

Parameters:

sample -- sample to process

Returns:

processed sample
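
The sample-in, sample-out contract can be illustrated with a toy stand-in. This sketch does not run YOLOE or SAM2 and is not the operator's code: the function body, the `videos` key, and the `num_videos` output field are hypothetical and chosen only to show that a sample (a dict) goes in and a sample with added fields comes out.

```python
def process_single(sample=None, rank=None):
    """Toy stand-in for a sample-level mapper (hypothetical logic)."""
    # Copy so the caller's sample is left untouched in this sketch.
    processed = dict(sample)
    # Hypothetical derived field; the real operator would attach
    # segmentation results here instead.
    processed["num_videos"] = len(processed.get("videos", []))
    return processed


# Hypothetical input sample with one video path.
sample = {"text": "a dog running", "videos": ["clip_0.mp4"]}
result = process_single(sample)
print(result)
```

The `rank` argument is accepted but unused here; in the signature above it is optional and defaults to None.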