data_juicer.ops.mapper.video_hand_reconstruction_hawor_mapper module
- class data_juicer.ops.mapper.video_hand_reconstruction_hawor_mapper.VideoHandReconstructionHaworMapper(*args, **kwargs)
Bases: Mapper
Use HaWoR and MoGe-2 for hand reconstruction.
- __init__(hawor_model_path: str = 'hawor.ckpt', hawor_config_path: str = 'model_config.yaml', hawor_detector_path: str = 'detector.pt', mano_right_path: str = 'path_to_mano_right_pkl', mano_left_path: str = 'path_to_mano_left_pkl', frame_field: str = 'video_frames', camera_calibration_field: str = 'camera_calibration', tag_field_name: str = 'hand_reconstruction_hawor_tags', thresh: float = 0.2, *args, **kwargs)
Initialization method.
- Parameters:
hawor_model_path – The path to ‘hawor.ckpt’ for the HaWoR model.
hawor_config_path – The path to ‘model_config.yaml’ for the HaWoR model.
hawor_detector_path – The path to ‘detector.pt’ for the HaWoR model.
mano_right_path – The path to ‘MANO_RIGHT.pkl’. Users need to download this file from https://mano.is.tue.mpg.de/ and comply with the MANO license.
mano_left_path – The path to ‘MANO_LEFT.pkl’. Users need to download this file from https://mano.is.tue.mpg.de/ and comply with the MANO license. Used for accurate left-hand wrist offset computation (with shapedirs bug-fix).
frame_field – The field name where the video frames are stored.
camera_calibration_field – The field name where the camera calibration info is stored.
tag_field_name – The field name under which the tags are stored. Defaults to “hand_reconstruction_hawor_tags”.
thresh – The confidence threshold for hand detection. Defaults to 0.2.
args – Extra positional arguments.
kwargs – Extra keyword arguments.
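As a rough illustration of how these parameters fit together, the sketch below assembles the constructor's keyword arguments from two local directories. The helper `build_hawor_kwargs` and the directory layout are assumptions made for this example and are not part of Data-Juicer; only the parameter names and default values come from the signature above.

```python
from pathlib import Path


def build_hawor_kwargs(model_dir, mano_dir, thresh=0.2):
    """Hypothetical helper: collect constructor arguments for the mapper.

    Assumes the HaWoR files sit in ``model_dir`` and the MANO files in
    ``mano_dir``, under the file names used in the parameter descriptions.
    """
    model_dir = Path(model_dir)
    mano_dir = Path(mano_dir)
    return {
        "hawor_model_path": str(model_dir / "hawor.ckpt"),
        "hawor_config_path": str(model_dir / "model_config.yaml"),
        "hawor_detector_path": str(model_dir / "detector.pt"),
        "mano_right_path": str(mano_dir / "MANO_RIGHT.pkl"),
        "mano_left_path": str(mano_dir / "MANO_LEFT.pkl"),
        # The remaining values repeat the documented defaults.
        "frame_field": "video_frames",
        "camera_calibration_field": "camera_calibration",
        "tag_field_name": "hand_reconstruction_hawor_tags",
        "thresh": thresh,
    }


# Example directories; replace them with wherever the files actually live.
kwargs = build_hawor_kwargs("weights/hawor", "weights/mano")
# With Data-Juicer installed and the files in place, the mapper could then
# be created as VideoHandReconstructionHaworMapper(**kwargs).
```

Keeping the arguments in a plain dictionary makes it easy to check the paths before the models are loaded.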
- detect_track(imgfiles: list, hand_det_model, thresh: float = 0.5) tuple
Detects and tracks hands across a sequence of images using YOLO.
- Parameters:
imgfiles (list) – List of image frames.
hand_det_model (YOLO) – The initialized YOLO hand detection model.
thresh (float) – Confidence threshold for detection.
- Returns:
A tuple of (list of boxes, dict of tracks). The list of boxes is unused in the original logic.
- Return type:
tuple
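The sketch below illustrates only the thresholding and grouping idea behind a dictionary of tracks, using plain Python data. The `(frame_idx, track_id, confidence, box)` tuple layout, the per-detection dictionary keys, and the helper `group_tracks` are assumptions for illustration; the actual method obtains detections and track IDs from the YOLO model and may structure its entries differently.

```python
from collections import defaultdict


def group_tracks(detections, thresh=0.5):
    """Hypothetical helper: drop low-confidence detections, group by track ID.

    Each detection is assumed to be (frame_idx, track_id, confidence, box).
    """
    tracks = defaultdict(list)
    for frame_idx, track_id, conf, box in detections:
        if conf < thresh:
            continue  # below the confidence threshold, so it is ignored
        tracks[track_id].append({"frame": frame_idx, "box": box, "conf": conf})
    return dict(tracks)


detections = [
    (0, 1, 0.91, (10, 20, 110, 140)),
    (0, 2, 0.32, (200, 40, 280, 150)),  # dropped when thresh=0.5
    (1, 1, 0.88, (12, 22, 112, 141)),
]
tracks = group_tracks(detections, thresh=0.5)
```

Note that this method's own default threshold (0.5) differs from the constructor's default (0.2), so a lower value keeps more low-confidence detections in the tracks.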
- hawor_motion_estimation(imgfiles: list, tracks: dict, model, img_focal: float, frame_file_paths: list, single_image: bool = False) dict
Performs HaWoR 3D hand reconstruction on detected and tracked hand regions.
- Parameters:
imgfiles (list) – List of decoded image frames (numpy arrays).
tracks (dict) – Dictionary mapping track ID to a list of detection objects.
model (HAWOR) – The initialized HaWoR model.
img_focal (float) – Camera focal length.
frame_file_paths (list) – List of file paths readable by HaWoR (pre-materialized on disk if input was bytes).
single_image (bool) – Flag for single-image processing mode.
- Returns:
Reconstructed parameters (‘left’ and ‘right’ hand results).
- Return type:
dict