data_juicer.ops.mapper.video_hand_reconstruction_mapper module

class data_juicer.ops.mapper.video_hand_reconstruction_mapper.VideoHandReconstructionMapper(wilor_model_path: str = 'wilor_final.ckpt', wilor_model_config: str = 'model_config.yaml', detector_model_path: str = 'detector.pt', mano_right_path: str = 'path_to_mano_right_pkl', frame_num: Annotated[int, Gt(gt=0)] = 3, duration: float = 0, batch_size: int = 16, tag_field_name: str = 'hand_reconstruction_tags', frame_dir: str = '/home/runner/.cache/data_juicer/assets', if_save_visualization: bool = True, save_visualization_dir: str = '/home/runner/.cache/data_juicer/assets', if_save_mesh: bool = True, save_mesh_dir: str = '/home/runner/.cache/data_juicer/assets', *args, **kwargs)

Bases: Mapper

Use the WiLoR model for hand localization and reconstruction.

__init__(wilor_model_path: str = 'wilor_final.ckpt', wilor_model_config: str = 'model_config.yaml', detector_model_path: str = 'detector.pt', mano_right_path: str = 'path_to_mano_right_pkl', frame_num: Annotated[int, Gt(gt=0)] = 3, duration: float = 0, batch_size: int = 16, tag_field_name: str = 'hand_reconstruction_tags', frame_dir: str = '/home/runner/.cache/data_juicer/assets', if_save_visualization: bool = True, save_visualization_dir: str = '/home/runner/.cache/data_juicer/assets', if_save_mesh: bool = True, save_mesh_dir: str = '/home/runner/.cache/data_juicer/assets', *args, **kwargs)

Initialization method.

Parameters:
  • wilor_model_path – The path to ‘wilor_final.ckpt’.

  • wilor_model_config – The path to ‘model_config.yaml’ for the WiLoR model.

  • detector_model_path – The path to ‘detector.pt’ for the WiLoR model.

  • mano_right_path – The path to ‘MANO_RIGHT.pkl’. Users need to download this file from https://mano.is.tue.mpg.de/ and comply with the MANO license.

  • frame_num – The number of frames to be extracted uniformly from the video. If it’s 1, only the middle frame will be extracted. If it’s 2, only the first and the last frames will be extracted. If it’s larger than 2, in addition to the first and the last frames, other frames will be extracted uniformly within the video duration. If “duration” > 0, frame_num is the number of frames per segment.

  • duration – The duration of each segment in seconds. If 0, frames are extracted from the entire video. If duration > 0, the video is segmented into multiple segments based on duration, and frames are extracted from each segment.

  • batch_size – Batch size for simultaneous hand inference.

  • tag_field_name – The field name to store the tags. Defaults to “hand_reconstruction_tags”.

  • frame_dir – Output directory to save extracted frames.

  • if_save_visualization – Whether to save overlay images.

  • save_visualization_dir – The path for saving overlay images.

  • if_save_mesh – Whether to save images of the hand mesh.

  • save_mesh_dir – The path for saving images of the hand mesh.

  • args – Extra positional arguments.

  • kwargs – Extra keyword arguments.
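
The frame_num and duration rules above can be sketched in plain Python. This is only an illustration of the documented behavior, not the mapper's actual implementation: the helper names uniform_frame_indices and split_into_segments are hypothetical, and the handling of a shorter final segment and of frame_num values larger than the frame count are assumptions.

```python
from typing import List, Tuple


def uniform_frame_indices(total_frames: int, frame_num: int) -> List[int]:
    """Hypothetical helper: choose frame indices per the frame_num rules."""
    if total_frames <= 0 or frame_num <= 0:
        raise ValueError("total_frames and frame_num must be positive")
    # Assumption: never request more frames than the video contains.
    frame_num = min(frame_num, total_frames)
    last = total_frames - 1
    if frame_num == 1:
        return [last // 2]  # only the middle frame
    # First and last frames, plus evenly spaced frames in between.
    # Integer arithmetic avoids float rounding surprises.
    return [(i * last) // (frame_num - 1) for i in range(frame_num)]


def split_into_segments(video_seconds: float, duration: float) -> List[Tuple[float, float]]:
    """Hypothetical helper: split a video into (start, end) segments."""
    if duration <= 0:
        # duration == 0 means the whole video is treated as one segment.
        return [(0.0, video_seconds)]
    segments = []
    start = 0.0
    while start < video_seconds:
        # Assumption: the final segment may be shorter than `duration`.
        end = min(start + duration, video_seconds)
        segments.append((start, end))
        start = end
    return segments


if __name__ == "__main__":
    print(uniform_frame_indices(100, 1))   # [49]
    print(uniform_frame_indices(100, 3))   # [0, 49, 99]
    print(split_into_segments(10.0, 4.0))  # [(0.0, 4.0), (4.0, 8.0), (8.0, 10.0)]
```

With duration > 0, frame_num applies to each segment, so a 10-second video with duration=4 and frame_num=3 would yield up to 9 frames under these assumptions.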

project_full_img(points, cam_trans, focal_length, img_res)

process_single(sample=None, rank=None)

Sample-level processing: takes one sample and returns one sample (sample -> sample).

Parameters:
  • sample – The sample to process.

Returns:
  The processed sample.