data_juicer.ops.mapper.video_depth_estimation_mapper module

class data_juicer.ops.mapper.video_depth_estimation_mapper.VideoDepthEstimationMapper(video_depth_model_path: str = 'video_depth_anything_vitb.pth', point_cloud_dir_for_metric: str = '/home/runner/.cache/data_juicer/assets', max_res: int = 1280, torch_dtype: str = 'fp16', if_save_visualization: bool = False, save_visualization_dir: str = '/home/runner/.cache/data_juicer/assets', grayscale: bool = False, *args, **kwargs)

Bases: Mapper

Perform depth estimation on the video.

__init__(video_depth_model_path: str = 'video_depth_anything_vitb.pth', point_cloud_dir_for_metric: str = '/home/runner/.cache/data_juicer/assets', max_res: int = 1280, torch_dtype: str = 'fp16', if_save_visualization: bool = False, save_visualization_dir: str = '/home/runner/.cache/data_juicer/assets', grayscale: bool = False, *args, **kwargs)

Initialization method.

Parameters:
  • video_depth_model_path – The path to the Video-Depth-Anything model. If the model is a ‘metric’ model, the code automatically switches to metric mode, and the user should also provide a path for storing point clouds (see point_cloud_dir_for_metric).

  • point_cloud_dir_for_metric – The path for storing point clouds (used only with a ‘metric’ model).

  • max_res – The maximum resolution threshold for videos; videos exceeding this threshold will be resized.

  • torch_dtype – The floating-point type used for model inference. Can be one of [‘fp32’, ‘fp16’].

  • if_save_visualization – Whether to save visualization results.

  • save_visualization_dir – The path for saving visualization results.

  • grayscale – If True, the color palette is not applied to the visualization.
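The parameter constraints above can be illustrated with a small, self-contained sketch. It does not import data_juicer. `build_op_kwargs` and `fit_to_max_res` are hypothetical helper names introduced here for illustration only, and the resize policy (shrink so that the longer side equals `max_res`, keeping the aspect ratio) is an assumption: the documentation only says that videos exceeding the threshold are resized.

```python
from typing import Dict, Tuple

# The two values the documentation lists for torch_dtype.
ALLOWED_DTYPES = ("fp32", "fp16")


def build_op_kwargs(
    video_depth_model_path: str = "video_depth_anything_vitb.pth",
    max_res: int = 1280,
    torch_dtype: str = "fp16",
    if_save_visualization: bool = False,
    grayscale: bool = False,
) -> Dict[str, object]:
    """Hypothetical helper: collect constructor arguments and reject bad values."""
    if torch_dtype not in ALLOWED_DTYPES:
        raise ValueError(
            f"torch_dtype must be one of {ALLOWED_DTYPES}, got {torch_dtype!r}"
        )
    if max_res <= 0:
        raise ValueError("max_res must be a positive integer")
    return {
        "video_depth_model_path": video_depth_model_path,
        "max_res": max_res,
        "torch_dtype": torch_dtype,
        "if_save_visualization": if_save_visualization,
        "grayscale": grayscale,
    }


def fit_to_max_res(width: int, height: int, max_res: int = 1280) -> Tuple[int, int]:
    """Assumed resize policy: cap the longer side at max_res, keep aspect ratio."""
    longer = max(width, height)
    if longer <= max_res:
        # Already within the threshold, so the frame size is left unchanged.
        return width, height
    scale = max_res / longer
    return max(1, round(width * scale)), max(1, round(height * scale))


kwargs = build_op_kwargs(torch_dtype="fp32", max_res=1280)
resized = fit_to_max_res(1920, 1080, kwargs["max_res"])
```

With data_juicer installed and the model weights available, a dictionary like `kwargs` could presumably be passed to the constructor as `VideoDepthEstimationMapper(**kwargs)`; that call is not exercised here.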

process_single(sample=None, rank=None)

For sample level, sample -> sample

Parameters:

sample – sample to process

Returns:

processed sample
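To make the sample-in, sample-out contract concrete, here is a toy stand-in written without data_juicer. `ToyDepthMapper` and the key names `videos` and `depth_info` are hypothetical and chosen only for illustration; the real op runs the depth model and may store its results under different fields. The `rank` argument is assumed here to identify the worker that handles the sample.

```python
from typing import Any, Dict, Optional


class ToyDepthMapper:
    """Hypothetical stand-in that mimics the sample -> sample contract."""

    def __init__(self, video_key: str = "videos", out_key: str = "depth_info"):
        # Both key names are illustrative assumptions, not the real op's fields.
        self.video_key = video_key
        self.out_key = out_key

    def process_single(
        self,
        sample: Optional[Dict[str, Any]] = None,
        rank: Optional[int] = None,
    ) -> Dict[str, Any]:
        if sample is None:
            raise ValueError("sample must not be None")
        videos = sample.get(self.video_key, [])
        # A real mapper would run depth estimation on each video here.
        # This placeholder only records which video was seen and by which rank.
        sample[self.out_key] = [{"video": v, "rank": rank} for v in videos]
        # The same sample is returned with one extra field attached.
        return sample


sample = {"videos": ["clip_a.mp4", "clip_b.mp4"]}
processed = ToyDepthMapper().process_single(sample, rank=0)
```

The point of the sketch is the shape of the call: one sample goes in, and the same sample comes back with additional fields, rather than being filtered out or split.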