data_juicer.ops.mapper.video_hand_reconstruction_hawor_mapper module

class data_juicer.ops.mapper.video_hand_reconstruction_hawor_mapper.VideoHandReconstructionHaworMapper(*args, **kwargs)[source]

Bases: Mapper

Uses HaWoR and MoGe-2 for 3D hand reconstruction from video frames.

__init__(hawor_model_path: str = 'hawor.ckpt', hawor_config_path: str = 'model_config.yaml', hawor_detector_path: str = 'detector.pt', mano_right_path: str = 'path_to_mano_right_pkl', mano_left_path: str = 'path_to_mano_left_pkl', frame_field: str = 'video_frames', camera_calibration_field: str = 'camera_calibration', tag_field_name: str = 'hand_reconstruction_hawor_tags', thresh: float = 0.2, *args, **kwargs)[source]

Initialization method.

Parameters:
  • hawor_model_path -- The path to 'hawor.ckpt' for the HaWoR model.

  • hawor_config_path -- The path to 'model_config.yaml' for the HaWoR model.

  • hawor_detector_path -- The path to 'detector.pt' for the HaWoR model.

  • mano_right_path -- The path to 'MANO_RIGHT.pkl'. Users need to download this file from https://mano.is.tue.mpg.de/ and comply with the MANO license.

  • mano_left_path -- The path to 'MANO_LEFT.pkl'. Users need to download this file from https://mano.is.tue.mpg.de/ and comply with the MANO license. Used to compute the left-hand wrist offset accurately (with the shapedirs bug fix).

  • frame_field -- The field name where the video frames are stored.

  • camera_calibration_field -- The field name where the camera calibration info is stored.

  • tag_field_name -- The field name under which the tags are stored. Defaults to "hand_reconstruction_hawor_tags".

  • thresh -- The confidence threshold for hand detection. Default is 0.2.

  • args -- Extra positional arguments.

  • kwargs -- Extra keyword arguments.
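
A minimal sketch of collecting the constructor arguments documented above. The file paths are placeholders, and using `video_hand_reconstruction_hawor_mapper` as a config key is an assumption inferred from the module name. The import and instantiation are left commented out because they require Data-Juicer, HaWoR, and the model files to be available.

```python
# Keyword arguments mirroring the documented __init__ signature.
# Every path here is a placeholder; point it at your own downloaded file.
hawor_kwargs = {
    "hawor_model_path": "hawor.ckpt",
    "hawor_config_path": "model_config.yaml",
    "hawor_detector_path": "detector.pt",
    "mano_right_path": "/path/to/MANO_RIGHT.pkl",  # from https://mano.is.tue.mpg.de/
    "mano_left_path": "/path/to/MANO_LEFT.pkl",    # from https://mano.is.tue.mpg.de/
    "frame_field": "video_frames",
    "camera_calibration_field": "camera_calibration",
    "tag_field_name": "hand_reconstruction_hawor_tags",
    "thresh": 0.2,  # documented default detection confidence threshold
}

# Assumed shape of a config entry keyed by operator name (name inferred
# from the module name, not confirmed by this page).
process_entry = {"video_hand_reconstruction_hawor_mapper": hawor_kwargs}

# With Data-Juicer installed and the model files in place, the operator
# could then be built from the same arguments:
# from data_juicer.ops.mapper.video_hand_reconstruction_hawor_mapper import (
#     VideoHandReconstructionHaworMapper,
# )
# op = VideoHandReconstructionHaworMapper(**hawor_kwargs)
```

Keeping the arguments in one dictionary makes it easy to reuse them both in a config entry and in a direct constructor call.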

detect_track(imgfiles: list, hand_det_model, thresh: float = 0.5) → tuple[source]

Detects and tracks hands across a sequence of images using YOLO.

Parameters:
  • imgfiles (list) -- List of image frames.

  • hand_det_model (YOLO) -- The initialized YOLO hand detection model.

  • thresh (float) -- Confidence threshold for detection.

Returns:

A tuple of (list of boxes, dict of tracks). The list of boxes is unused in the original logic.

Return type:

tuple
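
The detection and tracking logic itself is not shown on this page. As a toy illustration of what a confidence threshold and a "dict of tracks" mean, the sketch below filters hypothetical per-frame detections by confidence and groups the survivors by track ID. The `Detection` tuple and the `group_tracks` helper are invented for illustration and are not part of Data-Juicer or YOLO.

```python
from collections import defaultdict
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical detection record (not the library's own type)."""
    frame: int
    track_id: int
    conf: float
    box: tuple  # (x1, y1, x2, y2) in pixels


def group_tracks(detections, thresh=0.5):
    """Keep detections with conf >= thresh and group them by track ID."""
    tracks = defaultdict(list)
    for det in detections:
        if det.conf >= thresh:
            tracks[det.track_id].append(det)
    return dict(tracks)


detections = [
    Detection(frame=0, track_id=1, conf=0.9, box=(10, 10, 50, 50)),
    Detection(frame=0, track_id=2, conf=0.3, box=(60, 10, 90, 40)),
    Detection(frame=1, track_id=1, conf=0.8, box=(12, 11, 52, 51)),
]

# With thresh=0.5 only track 1 survives; lowering it to 0.2 also keeps track 2.
tracks = group_tracks(detections, thresh=0.5)
```

A lower threshold keeps more candidate hands at the cost of more false positives, which is the trade-off the `thresh` parameter controls.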

hawor_motion_estimation(imgfiles: list, tracks: dict, model, img_focal: float, frame_file_paths: list, single_image: bool = False) → dict[source]

Performs HaWoR 3D hand reconstruction on detected and tracked hand regions.

Parameters:
  • imgfiles (list) -- List of decoded image frames (numpy arrays).

  • tracks (dict) -- Dictionary mapping track ID to a list of detection objects.

  • model (HAWOR) -- The initialized HaWoR model.

  • img_focal (float) -- Camera focal length.

  • frame_file_paths (list) -- List of frame file paths readable by HaWoR (written to disk beforehand if the input frames were bytes).

  • single_image (bool) -- Flag for single-image processing mode.

Returns:

Reconstructed hand parameters, with results for the 'left' and 'right' hands.

Return type:

dict
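
The `frame_file_paths` parameter expects paths on disk even when frames arrive as raw bytes. One way such pre-materialization could be done is sketched below. The `materialize_frames` helper, the file naming scheme, and the `.jpg` suffix are assumptions for illustration, not the operator's actual implementation.

```python
import os
import tempfile


def materialize_frames(frames, out_dir, suffix=".jpg"):
    """Return one file path per frame, writing bytes frames into out_dir.

    Frames that are already paths are passed through unchanged.
    """
    paths = []
    for idx, frame in enumerate(frames):
        if isinstance(frame, (bytes, bytearray)):
            path = os.path.join(out_dir, f"frame_{idx:06d}{suffix}")
            with open(path, "wb") as f:
                f.write(frame)
            paths.append(path)
        else:
            paths.append(str(frame))
    return paths


with tempfile.TemporaryDirectory() as tmp:
    # One frame given as bytes (dummy payload), one already on disk.
    paths = materialize_frames([b"\xff\xd8dummy", "/data/frame_000001.jpg"], tmp)
    first_exists = os.path.exists(paths[0])
    first_name = os.path.basename(paths[0])
```

Note that the temporary directory is deleted when the `with` block exits, so any code reading the files must run inside it.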

process_single(sample=None, rank=None)[source]

Processes a single sample (sample-level mapping: sample -> sample).

Parameters:

sample -- The sample to process.

Returns:

The processed sample.