pose_estimator

Role: Turns 2D detections (YOLO masks, image-matching keypoints) into 3D poses via PnP + RANSAC and broadcasts each result as a TF. cluster_tf consumes that TF broadcast, not pose topics.

Per-task estimator nodes

Each RoboSub task gets its own estimator:

Node	Source 2D data	Output frame example
`gate_pose_estimator_node`	YOLO polyline keypoints	`gate/front`
`points_pose_estimator_node`	Image-matching point correspondences	template-specific
`slalom_pose_estimator_node`	YOLO + depth	per-pole
`bin_pose_estimator_node`	YOLO	`bin/yolo`, `bin/centre/view`, …
`torpedo_pose_estimator_node`	YOLO + correspondences	`torpedo/yolo`, `torpedo_1/{fish,shark}/view`, …
`trash_pose_estimator_*.py`	YOLO	per-object

Base class

All of these nodes inherit from utils/pose_estimator_node.py:PoseEstimatorTransformPubNode, which provides the tf2_ros.TransformBroadcaster plumbing.

Don't use PoseEstimatorPosePubNode for tasks that feed cluster_tf

cluster_tf only consumes TFs. If you inherit from PoseEstimatorPosePubNode (pose-topic publisher) instead of PoseEstimatorTransformPubNode, cluster_tf collects zero samples and the BT stalls in the search leg.

ROS interfaces (typical)

Direction	Topic / TF	Type
Sub	`input_detections_topic` (default `yolo/detections`)	`yolo_msgs/DetectionArray`
Sub	`camera_info_topic`	`sensor_msgs/CameraInfo` (required at startup)
Broadcast	TF: `object_frame_id` under camera optical frame	`geometry_msgs/TransformStamped`
Pub (optional)	`{object_frame_id}/pose`	`geometry_msgs/PoseWithCovarianceStamped`

Parameters: object_frame_id, input_detections_topic, camera_info_topic, from_front, is_image_rectified.
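The is_image_rectified parameter matters because PnP needs intrinsics that match the pixels the detections came from. The sketch below follows the usual sensor_msgs/CameraInfo convention: use K with the distortion coefficients D for a raw image, or the left 3x3 block of P with zero distortion for a rectified one. Whether the estimator nodes select intrinsics exactly this way is an assumption, and `intrinsics_for_pnp` is a hypothetical helper name, not part of the package.

```python
def intrinsics_for_pnp(k, d, p, is_image_rectified):
    """Pick the camera matrix and distortion coefficients to hand to PnP.

    k: 9 values, row-major 3x3 intrinsic matrix of the raw image (CameraInfo.k)
    d: distortion coefficients of the raw image (CameraInfo.d)
    p: 12 values, row-major 3x4 projection matrix of the rectified image (CameraInfo.p)
    """
    if is_image_rectified:
        # Rectified pixels are already undistorted: use the left 3x3 block of P
        # and zero distortion so the points are not "undistorted" twice.
        camera_matrix = [list(p[0:3]), list(p[4:7]), list(p[8:11])]
        dist_coeffs = [0.0] * 5
    else:
        # Raw pixels: use K together with the calibrated distortion model.
        camera_matrix = [list(k[0:3]), list(k[3:6]), list(k[6:9])]
        dist_coeffs = list(d)
    return camera_matrix, dist_coeffs


# Illustrative calibration values only (not taken from any real camera).
K = [500.0, 0.0, 320.0, 0.0, 500.0, 240.0, 0.0, 0.0, 1.0]
D = [0.1, -0.2, 0.0, 0.0, 0.0]
P = [450.0, 0.0, 310.0, 0.0, 0.0, 450.0, 230.0, 0.0, 0.0, 0.0, 1.0, 0.0]

rect_matrix, rect_dist = intrinsics_for_pnp(K, D, P, is_image_rectified=True)
raw_matrix, raw_dist = intrinsics_for_pnp(K, D, P, is_image_rectified=False)
```

Setting is_image_rectified incorrectly typically shows up as a pose that is roughly right near the image centre and drifts towards the edges.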

Algorithm

1. Match detection keypoints to the 3D object points defined in config/object_points.py.
2. Run cv2.solvePnPRansac to get an initial pose.
3. Refine the pose using only the RANSAC inliers.
4. Optionally apply a homography filter (for planar targets such as gates).
5. Broadcast the TF; optionally publish the pose with covariance.
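Two parts of the pipeline above can be illustrated without OpenCV or ROS. The first is the consensus test behind the RANSAC and refinement steps: a correspondence counts as an inlier when its reprojection error under a candidate pose is below a pixel threshold. The second is the conversion of a PnP rotation vector (axis-angle, as returned by cv2.solvePnPRansac) into the quaternion a TF message carries. This is a minimal pure-Python sketch, not the package's implementation: the function names, the 3 px threshold, and the distortion-free pinhole model are all assumptions made for illustration.

```python
import math


def project_point(point, rotation, translation, fx, fy, cx, cy):
    """Project one object-frame point to pixels; return None if it is behind the camera."""
    # Camera-frame coordinates: p_cam = R * p_obj + t
    x, y, z = (
        sum(rotation[row][col] * point[col] for col in range(3)) + translation[row]
        for row in range(3)
    )
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_inliers(object_points, image_points, rotation, translation,
                         fx, fy, cx, cy, max_error_px=3.0):
    """Return indices of correspondences whose reprojection error is within the threshold."""
    inliers = []
    for index, (obj, (u, v)) in enumerate(zip(object_points, image_points)):
        projected = project_point(obj, rotation, translation, fx, fy, cx, cy)
        if projected is None:
            continue  # a point behind the camera can never be an inlier
        if math.hypot(projected[0] - u, projected[1] - v) <= max_error_px:
            inliers.append(index)
    return inliers


def rotvec_to_quaternion(rvec):
    """Convert an axis-angle rotation vector to an (x, y, z, w) quaternion."""
    angle = math.sqrt(sum(c * c for c in rvec))
    if angle < 1e-12:
        return (0.0, 0.0, 0.0, 1.0)  # no rotation
    scale = math.sin(angle / 2.0) / angle
    return (rvec[0] * scale, rvec[1] * scale, rvec[2] * scale, math.cos(angle / 2.0))


# Hypothetical 1 m square target, 2 m in front of the camera, no rotation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
# The last detection is deliberately 40 px off to act as an outlier.
detections = [(50.0, 50.0), (100.0, 50.0), (50.0, 100.0), (140.0, 100.0)]

inlier_ids = reprojection_inliers(square, detections, identity, (0.0, 0.0, 2.0),
                                  fx=100.0, fy=100.0, cx=50.0, cy=50.0)
quarter_turn = rotvec_to_quaternion((0.0, 0.0, math.pi / 2.0))
```

In the real nodes the rotation matrix would come from the PnP rotation vector (for example via cv2.Rodrigues), and refinement would rerun the solver on only the correspondences listed in `inlier_ids`.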

Gotchas

Frame IDs: the broadcast TF's parent is inferred from camera_info; the child is the object_frame_id parameter. Both must match what cluster_tf is configured to look up.
Each detection needs at least 3 keypoints. The gate estimator drops detections with fewer than that.
Image encoding: BGR8, inherited from the YOLO chain.
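The first two gotchas lend themselves to cheap pre-flight checks. The sketch below is one way to express them, assuming the parent frame is the camera_info header's frame_id and that the consumer's expected parent and child frames are known from its configuration. The helper names and the example frame IDs are hypothetical; only the threshold of 3 keypoints comes from this page.

```python
MIN_KEYPOINTS = 3  # threshold stated on this page


def has_enough_keypoints(keypoints, minimum=MIN_KEYPOINTS):
    """Return True when a detection carries enough keypoints to be processed."""
    return len(keypoints) >= minimum


def frame_mismatches(camera_info_frame_id, object_frame_id,
                     expected_parent, expected_child):
    """List human-readable mismatches between broadcast and expected TF frames."""
    problems = []
    if camera_info_frame_id != expected_parent:
        problems.append(
            f"parent: broadcasting under '{camera_info_frame_id}', "
            f"consumer looks up '{expected_parent}'"
        )
    if object_frame_id != expected_child:
        problems.append(
            f"child: broadcasting '{object_frame_id}', "
            f"consumer looks up '{expected_child}'"
        )
    return problems


# Hypothetical frame names for illustration.
ok = frame_mismatches("front_camera_optical_frame", "gate/front",
                      "front_camera_optical_frame", "gate/front")
bad = frame_mismatches("front_camera_optical_frame", "gate/front",
                       "front_camera_optical_frame", "gate/back")
```

An empty list means the frame pair lines up; any entry points to the side (parent or child) that needs fixing before samples can be collected.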