data_juicer.ops.mapper.video_trajectory_overlay_mapper module

class data_juicer.ops.mapper.video_trajectory_overlay_mapper.VideoTrajectoryOverlayMapper(*args, **kwargs)[source]

Bases: Mapper

Prepare VLM-ready frames by sampling and overlaying hand trajectories.

Implements the visualization step from the paper https://arxiv.org/pdf/2510.21571:

"From each segment, we evenly sample 8 frames and highlight hand trajectories on each frame by projecting the world-space trajectory of the hand palm from the current frame to the end of the clip."

For each atomic action segment (output of VideoAtomicActionSegmentMapper), this operator:

  1. Evenly samples n_sample_frames frames from the segment.

  2. For each sampled frame, projects the future world-space wrist trajectory (from the current frame to the end of the segment) onto the image using camera intrinsics and cam_c2w.

  3. Draws the trajectory as a colored line with a dot at the current wrist position.

  4. Saves the overlay images and stores their paths in the segment.

The output is written back into each segment dict under "overlay_frames", ready to be consumed by the VLM captioning operator.
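The sampling and projection steps can be sketched roughly as follows. This is a minimal illustration, not the operator's actual code: the helper names (sample_frame_indices, intrinsics_from_fov_x, project_world_points) are hypothetical, and the sketch assumes a pinhole camera with square pixels, a principal point at the image center, fov_x given in degrees, a 4x4 cam_c2w matrix, and an OpenCV-style camera frame with +z pointing forward.

```python
import numpy as np


def sample_frame_indices(start, end, n_sample_frames=8):
    """Evenly pick n_sample_frames indices from the inclusive range [start, end]."""
    idx = np.linspace(start, end, num=n_sample_frames)
    return np.round(idx).astype(int).tolist()


def intrinsics_from_fov_x(fov_x_deg, width, height):
    """Build a 3x3 pinhole intrinsics matrix from a horizontal field of view.

    Assumption: square pixels and a principal point at the image center.
    """
    fx = 0.5 * width / np.tan(0.5 * np.deg2rad(fov_x_deg))
    return np.array([
        [fx, 0.0, width / 2.0],
        [0.0, fx, height / 2.0],
        [0.0, 0.0, 1.0],
    ])


def project_world_points(points_world, cam_c2w, K):
    """Project (N, 3) world-space points to pixel coordinates.

    cam_c2w is a camera-to-world transform, so it is inverted to obtain
    world-to-camera. Returns the (N, 2) pixel coordinates and a boolean
    mask marking the points that lie in front of the camera.
    """
    points_world = np.asarray(points_world, dtype=np.float64)
    w2c = np.linalg.inv(cam_c2w)
    ones = np.ones((len(points_world), 1))
    pts_cam = (w2c @ np.concatenate([points_world, ones], axis=1).T).T[:, :3]

    in_front = pts_cam[:, 2] > 1e-6
    # Use a dummy depth of 1 for points behind the camera to avoid dividing by zero;
    # callers should discard those points using the mask.
    z = np.where(in_front, pts_cam[:, 2], 1.0)
    uv = (K @ (pts_cam / z[:, None]).T).T[:, :2]
    return uv, in_front
```

For a sampled frame, the points passed to project_world_points would be the hand positions from that frame to the end of the segment, and only the projected points whose mask entry is True would be drawn.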

PALM_JOINT_INDEX = 9
__init__(segment_field: str = 'atomic_action_segments', camera_pose_field: str = 'video_camera_pose_tags', moge_field: str = 'camera_calibration_moge_tags', frame_field: str = 'video_frames', save_dir: str = None, n_sample_frames: int = 8, palm_joint_index: int = 9, dot_radius: int = 10, line_thickness: int = 4, trajectory_alpha: float = 0.7, *args, **kwargs)[source]

Initialization method.

Parameters:
  • segment_field -- Meta field storing atomic action segments.

  • camera_pose_field -- Meta field storing camera pose (cam_c2w).

  • moge_field -- Meta field storing MoGe calibration (for fov_x).

  • frame_field -- Field storing frame image paths.

  • save_dir -- Directory in which to save overlay images. If None, a temporary directory derived from the first frame path is used.

  • n_sample_frames -- Number of frames to evenly sample from each segment.

  • palm_joint_index -- MANO joint index used for the palm position. The default, 9, is the middle-finger MCP joint, which serves as a proxy for the palm center; joint 0 is the wrist root.

  • dot_radius -- Radius of the dot at the current wrist position.

  • line_thickness -- Thickness of the trajectory line.

  • trajectory_alpha -- Alpha blending for the trajectory overlay.
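To illustrate what trajectory_alpha controls, the sketch below assumes the overlay is composited with standard linear alpha blending; the operator's actual drawing code may differ. The function blend_overlay is hypothetical and uses only NumPy.

```python
import numpy as np


def blend_overlay(frame, overlay, alpha=0.7):
    """Linearly blend an overlay image onto a frame.

    Computes alpha * overlay + (1 - alpha) * frame per pixel, so alpha=1.0
    shows only the overlay and alpha=0.0 leaves the frame unchanged.
    Both inputs are uint8 arrays of the same shape.
    """
    out = alpha * overlay.astype(np.float32) + (1.0 - alpha) * frame.astype(np.float32)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

With the default of 0.7, the drawn trajectory dominates while the underlying frame stays partially visible beneath it.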

process_single(sample=None, rank=None)[source]

Processes a single sample and returns it (sample --> sample).

Parameters:

sample -- sample to process

Returns:

processed sample