# video_object_segmenting_mapper

Text-guided semantic segmentation of valid objects throughout the video (YOLOE + SAM2).

Type: **mapper**

Tags: gpu, hf, video

## 🔧 Parameter Configuration

| name | type | default | desc |
|--------|------|--------|------|
| `sam2_hf_model` | | `'facebook/sam2.1-hiera-tiny'` | |
| `yoloe_path` | | `'yoloe-11l-seg.pt'` | The path to the YOLOE model. |
| `yoloe_conf` | | `0.5` | Confidence threshold for YOLOE object detection. |
| `torch_dtype` | | `'bf16'` | The floating point type used for model inference. Can be one of ['fp32', 'fp16', 'bf16']. |
| `if_binarize` | | `True` | Whether the final mask requires binarization. If `if_save_visualization` is set to True, `if_binarize` is automatically set to True. |
| `if_save_visualization` | | `False` | Whether to save visualization results. |
| `save_visualization_dir` | | `DATA_JUICER_ASSETS_CACHE` | The path for saving visualization results. |
| `args` | | `''` | |
| `kwargs` | | `''` | |

## 📊 Effect demonstration

### test

```python
VideoObjectSegmentingMapper(
    sam2_hf_model='facebook/sam2.1-hiera-tiny',
    yoloe_path='yoloe-11l-seg.pt',
    yoloe_conf=0.2,
    torch_dtype='bf16',
    if_binarize=True,
    if_save_visualization=False,
)
```

#### 📥 input data

Sample 1: 1 video

- video: `video4.mp4`
- `main_character_list`: `['glasses', 'a woman', 'a window']`

Sample 2: 1 video

- video: `video3.mp4`
- `main_character_list`: `['a laptop']`

#### 📤 output data

Sample 1: empty

- `segment_data`: `[673, 3, 1, 360, 480]`
- `cls_id_dict`: `3`
- `object_cls_list`: `[3]`
- `yoloe_conf_list`: `[3]`

Sample 2: empty

- `segment_data`: `[1190, 1, 1, 640, 362]`
- `cls_id_dict`: `1`
- `object_cls_list`: `[1]`
- `yoloe_conf_list`: `[1]`

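The parameter table documents two rules that are easy to misread: `torch_dtype` accepts only `'fp32'`, `'fp16'`, or `'bf16'`, and enabling `if_save_visualization` forces `if_binarize` to True. The following is a minimal sketch of those rules in plain Python, not the operator's actual code. The helpers `resolve_options` and `binarize` are hypothetical names introduced for illustration, and the 0.5 mask threshold is an assumption, since the source does not state how binarization is performed.

```python
ALLOWED_DTYPES = ("fp32", "fp16", "bf16")  # values listed in the parameter table


def resolve_options(torch_dtype="bf16", if_binarize=True, if_save_visualization=False):
    """Hypothetical helper that mirrors the documented parameter rules."""
    if torch_dtype not in ALLOWED_DTYPES:
        raise ValueError(
            f"torch_dtype must be one of {ALLOWED_DTYPES}, got {torch_dtype!r}"
        )
    # Documented rule: saving visualizations switches binarization on.
    if if_save_visualization:
        if_binarize = True
    return {
        "torch_dtype": torch_dtype,
        "if_binarize": if_binarize,
        "if_save_visualization": if_save_visualization,
    }


def binarize(mask, threshold=0.5):
    """Turn a nested list of soft mask values into 0/1 values.

    The threshold is an assumption made for this sketch.
    """
    if isinstance(mask, (list, tuple)):
        return [binarize(item, threshold) for item in mask]
    return 1 if mask > threshold else 0


if __name__ == "__main__":
    options = resolve_options(if_binarize=False, if_save_visualization=True)
    print(options)
    print(binarize([[0.2, 0.9], [0.5, 0.51]]))
```

The sketch keeps the override in one place so that a caller who passes `if_binarize=False` together with `if_save_visualization=True` still ends up with binarized masks, which matches the behavior described in the table.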

## 🔗 related links

- [source code](../../../data_juicer/ops/mapper/video_object_segmenting_mapper.py)
- [unit test](../../../tests/ops/mapper/test_video_object_segmenting_mapper.py)
- [Return operator list](../../Operators.md)