data_juicer.ops.mapper.video_trajectory_overlay_mapper module
- class data_juicer.ops.mapper.video_trajectory_overlay_mapper.VideoTrajectoryOverlayMapper(*args, **kwargs)
Bases: Mapper
Prepare VLM-ready frames by sampling and overlaying hand trajectories.
Implements the visualization step from the paper https://arxiv.org/pdf/2510.21571:
“From each segment, we evenly sample 8 frames and highlight hand trajectories on each frame by projecting the world-space trajectory of the hand palm from the current frame to the end of the clip.”
For each atomic action segment (output of VideoAtomicActionSegmentMapper), this operator:
- Evenly samples n_sample_frames frames from the segment.
- For each sampled frame, projects the future world-space wrist trajectory (from the current frame to the end of the segment) onto the image using camera intrinsics and cam_c2w.
- Draws the trajectory as a colored line with a dot at the current wrist position.
- Saves the overlay images and stores their paths in the segment.
The output is written back into each segment dict under "overlay_frames", ready to be consumed by the VLM captioning operator.
- PALM_JOINT_INDEX = 9
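The sampling step described above can be sketched in a few lines. This is a minimal illustration, not the operator's actual code: the helper names `evenly_sample_indices` and `future_trajectory` are hypothetical, as is the assumption that a segment is described by inclusive start and end frame indices.

```python
from typing import List, Sequence


def evenly_sample_indices(start: int, end: int, n: int) -> List[int]:
    """Return n frame indices spread evenly over [start, end], inclusive.

    Assumption: segments are given as inclusive frame-index ranges.
    If the segment has fewer than n frames, indices repeat.
    """
    if n <= 0 or end < start:
        return []
    if n == 1:
        return [start]
    span = end - start
    return [start + round(i * span / (n - 1)) for i in range(n)]


def future_trajectory(positions: Sequence, current: int, end: int) -> list:
    """Slice per-frame positions from the current frame to the segment end."""
    return list(positions[current:end + 1])


# A 71-frame segment sampled at 8 frames.
indices = evenly_sample_indices(0, 70, 8)
print(indices)  # [0, 10, 20, 30, 40, 50, 60, 70]

# For the sampled frame 60, the overlay would cover frames 60..70.
positions = list(range(71))  # stand-in for per-frame 3D palm positions
print(len(future_trajectory(positions, 60, 70)))  # 11
```

Note that later sampled frames carry shorter trajectories, and the last sampled frame has only the current position left to draw.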
- __init__(segment_field: str = 'atomic_action_segments', camera_pose_field: str = 'video_camera_pose_tags', moge_field: str = 'camera_calibration_moge_tags', frame_field: str = 'video_frames', save_dir: str = None, n_sample_frames: int = 8, palm_joint_index: int = 9, dot_radius: int = 10, line_thickness: int = 4, trajectory_alpha: float = 0.7, *args, **kwargs)
Initialization method.
- Parameters:
segment_field – Meta field storing atomic action segments.
camera_pose_field – Meta field storing camera pose (cam_c2w).
moge_field – Meta field storing MoGe calibration (for fov_x).
frame_field – Field storing frame image paths.
save_dir – Directory to save overlay images. If None, uses a temp directory derived from the first frame path.
n_sample_frames – Number of frames to evenly sample from each segment.
palm_joint_index – MANO joint index used for the palm position. Defaults to 9, the middle-finger MCP joint, which serves as a proxy for the palm center. Joint 0 is the wrist root.
dot_radius – Radius of the dot at the current wrist position.
line_thickness – Thickness of the trajectory line.
trajectory_alpha – Alpha (opacity) used when blending the trajectory overlay onto the frame.
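To illustrate how the values behind camera_pose_field (cam_c2w) and moge_field (fov_x) could combine to project a world-space point onto a frame, here is a minimal pinhole-camera sketch. It is not taken from the operator's source. It assumes that cam_c2w is a 4x4 camera-to-world matrix, that fov_x is in radians, that pixels are square with the principal point at the image center, and that the camera follows the OpenCV convention (x right, y down, z forward). The function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def focal_from_fov_x(fov_x_rad: float, width: int) -> float:
    """Focal length in pixels from the horizontal field of view (radians)."""
    return (width / 2.0) / math.tan(fov_x_rad / 2.0)


def world_to_pixel(
    p_world: Sequence[float],
    cam_c2w: Sequence[Sequence[float]],
    fov_x_rad: float,
    width: int,
    height: int,
) -> Optional[Tuple[float, float]]:
    """Project a world-space point to pixel coordinates.

    Returns None when the point lies behind (or on) the camera plane.
    """
    # Split the camera-to-world matrix into rotation R and translation t.
    R = [row[:3] for row in cam_c2w[:3]]
    t = [row[3] for row in cam_c2w[:3]]

    # World -> camera: p_cam = R^T (p_world - t).
    d = [p_world[i] - t[i] for i in range(3)]
    x, y, z = [sum(R[k][j] * d[k] for k in range(3)) for j in range(3)]
    if z <= 1e-6:
        return None

    # Assumption: square pixels (fy == fx), principal point at image center.
    f = focal_from_fov_x(fov_x_rad, width)
    u = f * x / z + width / 2.0
    v = f * y / z + height / 2.0
    return (u, v)


# Identity pose: a point straight ahead lands on the image center.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(world_to_pixel((0.0, 0.0, 2.0), identity, math.pi / 2, 640, 480))
```

Applying `world_to_pixel` to every point of a frame's future trajectory would yield the polyline to draw. Points that return None would need to be skipped or clipped before drawing.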