data_juicer.ops.mapper.video_camera_pose_megasam_mapper module

data_juicer.ops.mapper.video_camera_pose_megasam_mapper.to_standard_list(obj)

class data_juicer.ops.mapper.video_camera_pose_megasam_mapper.VideoCameraPoseMegaSaMMapper(*args, **kwargs)

Bases: Mapper

Extract camera poses by leveraging MegaSaM and MoGe-2.

__init__(tag_field_name: str = 'video_camera_pose_tags', frame_field: str = 'video_frames', camera_calibration_field: str = 'camera_calibration', max_frames: int = 1000, droid_buffer: int = 1024, save_dir: str = None, use_prepare_env: bool = False, *args, **kwargs)

Initialization method.

Parameters:
  • tag_field_name – The field name to store the tags. Defaults to 'video_camera_pose_tags'.

  • frame_field – The field name where the video frames are stored.

  • camera_calibration_field – The field name where the camera calibration info is stored.

  • max_frames – Maximum number of frames to save.

  • droid_buffer – DROID SLAM pre-allocated frame buffer size. Controls GPU memory usage: each buffer slot pre-allocates correlation volumes on the GPU. The default of 1024 is sufficient for clips up to ~100 frames. Reduce it for shorter clips to save VRAM; increase it for longer videos.

  • save_dir – Directory in which to save large numpy arrays (depth, cam_c2w) as .npy files instead of storing them inline. When set, tag_dict stores file paths (strings) instead of numpy arrays, which avoids hitting memory limits.

  • use_prepare_env – Whether to prepare the environment.

  • args – Extra positional arguments.

  • kwargs – Extra keyword arguments.
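As a minimal sketch of how the constructor arguments fit together, the snippet below collects the documented defaults from the signature above and merges overrides into them. The helper `megasam_kwargs`, the `./megasam_out` path, and the buffer value of 256 are illustrative assumptions, not part of Data-Juicer; the operator itself is only referenced in a comment so the snippet runs without the library installed.

```python
# Documented constructor defaults, copied from the __init__ signature.
MEGASAM_DEFAULTS = {
    "tag_field_name": "video_camera_pose_tags",
    "frame_field": "video_frames",
    "camera_calibration_field": "camera_calibration",
    "max_frames": 1000,
    "droid_buffer": 1024,
    "save_dir": None,
    "use_prepare_env": False,
}


def megasam_kwargs(**overrides):
    """Hypothetical helper: merge overrides into the documented defaults,
    rejecting names that are not in the signature."""
    unknown = set(overrides) - set(MEGASAM_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown parameter(s): {sorted(unknown)}")
    return {**MEGASAM_DEFAULTS, **overrides}


# Example: write large arrays to disk and use a smaller DROID buffer.
# Both values are placeholders chosen for illustration.
kwargs = megasam_kwargs(save_dir="./megasam_out", droid_buffer=256)

# With Data-Juicer installed, the kwargs would be passed to the operator:
# from data_juicer.ops.mapper.video_camera_pose_megasam_mapper import (
#     VideoCameraPoseMegaSaMMapper,
# )
# op = VideoCameraPoseMegaSaMMapper(**kwargs)
```

Keeping the defaults in one dictionary makes it easy to see which settings differ from the documented values before the operator is constructed.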

process_single(sample=None, rank=None)

Processes a single sample (sample -> sample).

Parameters:

sample – sample to process

Returns:

processed sample
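The effect of save_dir on the stored tags can be illustrated with a small, self-contained sketch. It is not the operator's actual implementation: the helper `offload_arrays`, the file-naming scheme, and the toy array shapes are assumptions made for illustration. Only the idea comes from the parameter description, namely that large arrays such as depth and cam_c2w are written as .npy files and replaced by their paths.

```python
import os
import tempfile

import numpy as np


def offload_arrays(tag_dict, save_dir, prefix):
    """Hypothetical helper: save each numpy array in tag_dict to
    <save_dir>/<prefix>_<key>.npy and return a copy that holds the
    file path (a string) in place of the array."""
    os.makedirs(save_dir, exist_ok=True)
    out = {}
    for key, value in tag_dict.items():
        if isinstance(value, np.ndarray):
            path = os.path.join(save_dir, f"{prefix}_{key}.npy")
            np.save(path, value)
            out[key] = path
        else:
            out[key] = value
    return out


# Toy stand-ins for the real outputs; shapes are illustrative only.
tags = {
    "depth": np.zeros((4, 8, 8), dtype=np.float32),
    "cam_c2w": np.tile(np.eye(4, dtype=np.float32), (4, 1, 1)),
}

with tempfile.TemporaryDirectory() as tmp:
    stored = offload_arrays(tags, tmp, prefix="clip0")
    # Downstream code loads an array back from its path when needed.
    restored = np.load(stored["cam_c2w"])
    shapes_match = restored.shape == tags["cam_c2w"].shape
```

Storing paths keeps each sample small, at the cost of a disk read whenever a downstream step needs the full array.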

class data_juicer.ops.mapper.video_camera_pose_megasam_mapper.droid_args(image_size, buffer=1024)

Bases: object

__init__(image_size, buffer=1024)