data_juicer.ops.mapper.video_camera_pose_mapper module

class data_juicer.ops.mapper.video_camera_pose_mapper.VideoCameraPoseMapper(*args, **kwargs)

Bases: Mapper

Extract camera poses by leveraging MegaSaM and MoGe-2.

__init__(moge_model_path: str = 'Ruicheng/moge-2-vitl', frame_num: Annotated[int, Gt(gt=0)] = 3, duration: float = 0, tag_field_name: str = 'video_camera_pose_tags', frame_dir: str = '/home/runner/.cache/data_juicer/assets', if_output_moge_info: bool = False, moge_output_info_dir: str = '/home/runner/.cache/data_juicer/assets', if_save_info: bool = True, output_info_dir: str = '/home/runner/.cache/data_juicer/assets', max_frames: int = 1000, *args, **kwargs)

Initialization method.

Parameters:
  • moge_model_path – The path to the MoGe-2 model.

  • frame_num – The number of frames to be extracted uniformly from the video. If it’s 1, only the middle frame will be extracted. If it’s 2, only the first and the last frames will be extracted. If it’s larger than 2, in addition to the first and the last frames, other frames will be extracted uniformly within the video duration. If “duration” > 0, frame_num is the number of frames per segment.

  • duration – The duration of each segment in seconds. If 0, frames are extracted from the entire video. If duration > 0, the video is segmented into multiple segments based on duration, and frames are extracted from each segment.

  • tag_field_name – The field name to store the tags. Defaults to “video_camera_pose_tags”.

  • frame_dir – Output directory to save extracted frames.

  • if_output_moge_info – Whether to save the results from MoGe-2 to a JSON file.

  • moge_output_info_dir – Output directory for saving camera parameters.

  • if_save_info – Whether to save the results to an npz file.

  • output_info_dir – Output directory for saving the results.

  • max_frames – Maximum number of frames to save.

  • args – extra positional arguments

  • kwargs – extra keyword arguments
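The frame_num and duration rules above can be illustrated with a small, self-contained sketch. This is not the operator's actual implementation: the helper names `uniform_frame_indices` and `segment_ranges` are hypothetical, the floor-based index rounding is an assumption, and so is the handling of a shorter final segment.

```python
from typing import List, Tuple


def uniform_frame_indices(total_frames: int, frame_num: int) -> List[int]:
    """Pick frame indices following the rule described for `frame_num`.

    Hypothetical helper for illustration only.
    """
    if total_frames <= 0 or frame_num <= 0:
        raise ValueError("total_frames and frame_num must be positive")
    if frame_num == 1:
        # Only the middle frame.
        return [total_frames // 2]
    last = total_frames - 1
    # First and last frames, plus evenly spaced frames in between.
    # Integer floor division is an assumed rounding choice.
    indices = [i * last // (frame_num - 1) for i in range(frame_num)]
    # Drop duplicates that appear when frame_num exceeds total_frames.
    return sorted(set(indices))


def segment_ranges(total_seconds: float, duration: float) -> List[Tuple[float, float]]:
    """Split a video into (start, end) segments of `duration` seconds.

    duration <= 0 is treated as "use the entire video". Keeping a shorter
    final segment is an assumption made for this sketch.
    """
    if duration <= 0:
        return [(0.0, total_seconds)]
    ranges = []
    i = 0
    while i * duration < total_seconds:
        start = i * duration
        ranges.append((start, min(start + duration, total_seconds)))
        i += 1
    return ranges


# frame_num frames would then be drawn from each segment when duration > 0.
print(uniform_frame_indices(100, 3))
print(segment_ranges(10.0, 4.0))
```

With duration left at 0, the sketch yields a single segment covering the whole video, so frame_num applies to the entire clip.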

image_stream(frames_path, depth_list, intrinsics_list)

process_single(sample=None, rank=None)

Processes a single sample: sample -> sample.

Parameters:

sample – sample to process

Returns:

processed sample
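As a minimal sketch of how the operator might be constructed, the snippet below passes only keyword arguments that appear in the __init__ signature above. The wrapper `build_camera_pose_mapper` is a hypothetical name, and the sketch assumes Data-Juicer and the MoGe-2 and MegaSaM dependencies are installed. The layout of the sample dict expected by process_single is not described in this section, so it is left out.

```python
def build_camera_pose_mapper(output_dir: str):
    """Construct a VideoCameraPoseMapper with explicit, documented arguments.

    Hypothetical wrapper for illustration; only the keyword names are taken
    from the documented signature.
    """
    # Imported lazily so this file can be loaded without Data-Juicer installed.
    from data_juicer.ops.mapper.video_camera_pose_mapper import (
        VideoCameraPoseMapper,
    )

    return VideoCameraPoseMapper(
        moge_model_path="Ruicheng/moge-2-vitl",
        frame_num=3,             # first, middle, and last frame of each span
        duration=0,              # 0: sample from the entire video
        tag_field_name="video_camera_pose_tags",
        if_save_info=True,       # write results to an npz file
        output_info_dir=output_dir,
        max_frames=1000,
    )
```

A mapper built this way would then be applied with process_single(sample), which returns the processed sample.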

class data_juicer.ops.mapper.video_camera_pose_mapper.droid_args(image_size)

Bases: object

__init__(image_size)