data_juicer.ops.mapper.video_clip_reassembly_mapper module

class data_juicer.ops.mapper.video_clip_reassembly_mapper.VideoClipReassemblyMapper(*args, **kwargs)

Bases: Mapper

Reassemble hand-action results from overlapping video clips.

When long videos are chopped into overlapping clips (e.g. 5 s with 2 s overlap via VideoSplitByDurationMapper), each clip is processed independently through the 3-D motion labelling pipeline. This operator merges the per-clip results back into one unified result per original video, including:

hand_action_tags — states, actions, valid_frame_ids, joints
video_camera_pose_tags — cam_c2w array
hand_reconstruction_hawor_tags — per-clip frame_ids converted to global frame indices
video_frames — per-clip frame path lists merged into one global list
camera_calibration_moge_tags — per-clip depth/intrinsics merged
clips — replaced with the original video path
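
The merge of per-clip frame lists and frame ids can be pictured with a minimal sketch. It assumes each clip's global start offset (in frames) is already known, and that a frame covered by two clips is taken from the earlier clip; neither the helper names (`to_global_ids`, `merge_clip_frames`) nor this tie-breaking policy come from the operator itself, which may resolve overlaps differently.

```python
def to_global_ids(frame_ids, offset):
    """Shift clip-local frame ids by the clip's global start offset."""
    return [offset + fid for fid in frame_ids]


def merge_clip_frames(clip_frames, offsets):
    """Merge per-clip frame path lists into one global list.

    Assumption: when two clips cover the same global frame,
    the path from the earlier clip is kept.
    """
    merged = {}
    for frames, offset in zip(clip_frames, offsets):
        for local_idx, path in enumerate(frames):
            # setdefault keeps the first path seen for a global index
            merged.setdefault(offset + local_idx, path)
    return [merged[idx] for idx in sorted(merged)]


# Two hypothetical 4-frame clips; the second starts at global frame 2.
clip_frames = [
    ["c0/0.jpg", "c0/1.jpg", "c0/2.jpg", "c0/3.jpg"],
    ["c1/0.jpg", "c1/1.jpg", "c1/2.jpg", "c1/3.jpg"],
]
merged = merge_clip_frames(clip_frames, offsets=[0, 2])
# merged has 6 entries: four from clip 0, then the last two of clip 1
```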

Clip global offsets are determined automatically by pixel-matching overlapping frames between consecutive clips, rather than assuming an ideal step size. This handles ffmpeg keyframe-alignment drift that causes actual clip boundaries to differ from the nominal (split_duration - overlap_duration) * fps calculation.
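
To illustrate the idea, here is a minimal sketch that contrasts the nominal step with offsets recovered by matching overlapping frames. The function names, the mean-absolute-difference metric, and the `max_overlap` search bound are all assumptions made for the illustration; this page does not document the operator's actual matching criterion. Frames are modelled as flat tuples of pixel values.

```python
def nominal_step_frames(split_duration, overlap_duration, fps):
    """Ideal step between clip starts, in frames."""
    return round((split_duration - overlap_duration) * fps)


def frame_distance(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def estimate_overlap(prev_clip, next_clip, max_overlap):
    """Number of trailing frames of prev_clip that best match the
    leading frames of next_clip (lowest mean distance wins)."""
    limit = min(max_overlap, len(prev_clip), len(next_clip))
    best_overlap, best_cost = 0, float("inf")
    for overlap in range(1, limit + 1):
        tail = prev_clip[-overlap:]
        head = next_clip[:overlap]
        cost = sum(frame_distance(a, b) for a, b in zip(tail, head)) / overlap
        if cost < best_cost:
            best_overlap, best_cost = overlap, cost
    return best_overlap


def global_offsets(clips, max_overlap):
    """Global start frame of every clip, derived from matched overlaps."""
    offsets = [0]
    for prev_clip, next_clip in zip(clips, clips[1:]):
        overlap = estimate_overlap(prev_clip, next_clip, max_overlap)
        offsets.append(offsets[-1] + len(prev_clip) - overlap)
    return offsets


# Synthetic 20-frame video in which every frame is distinct.
video = [(i, i, i, i) for i in range(20)]
# Clip boundaries deliberately drift from a fixed step.
clips = [video[0:10], video[6:16], video[11:20]]
offsets = global_offsets(clips, max_overlap=8)  # [0, 6, 11]
step = nominal_step_frames(5.0, 2.0, 30.0)      # 90 frames at 30 fps
```

In this toy example the recovered offsets follow the real clip boundaries even though the step between clips is not constant, which is the situation a fixed nominal step cannot represent.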

Reference (paper §3.1): “To enhance efficiency, we chop long videos into overlapping 20-second clips in this stage and recompose their results.”

__init__(hand_action_field: str = 'hand_action_tags', camera_pose_field: str = 'video_camera_pose_tags', hand_reconstruction_field: str = 'hand_reconstruction_hawor_tags', frame_field: str = 'video_frames', moge_field: str = 'camera_calibration_moge_tags', clip_field: str = 'clips', video_key: str = 'videos', split_duration: float = None, overlap_duration: float = None, fps: float = None, *args, **kwargs)
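
As a usage sketch, the keyword arguments from the signature above can be collected in a dictionary and passed to the constructor. The field names and defaults are copied from the signature; the helper `build_reassembly_op` and the example values for `split_duration`, `overlap_duration`, and `fps` are hypothetical, and how the operator uses those three values is not described on this page. The import is deferred so the snippet loads even where data_juicer is not installed.

```python
# Field-name defaults copied from the documented __init__ signature.
DEFAULT_FIELDS = {
    "hand_action_field": "hand_action_tags",
    "camera_pose_field": "video_camera_pose_tags",
    "hand_reconstruction_field": "hand_reconstruction_hawor_tags",
    "frame_field": "video_frames",
    "moge_field": "camera_calibration_moge_tags",
    "clip_field": "clips",
    "video_key": "videos",
}


def build_reassembly_op(**overrides):
    """Hypothetical helper: construct the operator with the documented
    defaults, optionally overridden by keyword arguments."""
    # Imported lazily; requires data_juicer to be installed.
    from data_juicer.ops.mapper.video_clip_reassembly_mapper import (
        VideoClipReassemblyMapper,
    )

    kwargs = {**DEFAULT_FIELDS, **overrides}
    return VideoClipReassemblyMapper(**kwargs)


if __name__ == "__main__":
    # Example values only: 5 s clips with 2 s overlap at an assumed 30 fps.
    op = build_reassembly_op(split_duration=5.0, overlap_duration=2.0, fps=30.0)
```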

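Base class that conducts data editing. This description and the parameters listed below are inherited from the base Mapper class; the operator-specific arguments appear only in the signature above.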

Parameters:

text_key – the key name of the field that stores sample texts to be processed.
image_key – the key name of the field that stores the sample image list to be processed.
audio_key – the key name of the field that stores the sample audio list to be processed.
video_key – the key name of the field that stores the sample video list to be processed.
image_bytes_key – the key name of the field that stores the sample image bytes list to be processed.
query_key – the key name of the field that stores sample queries.
response_key – the key name of the field that stores responses.
history_key – the key name of the field that stores the history of queries and responses.

process_single(sample=None, rank=None)

Processes data at the sample level: sample -> sample.

Parameters: sample – the sample to process
Returns: the processed sample
