Tracking Objects as Points
We present a simultaneous detection and tracking algorithm that is simpler, faster, and more accurate than the state of the art.
Our tracker, CenterTrack, applies a detection model to a pair of images and detections from the prior frame.
CenterTrack localizes objects and predicts their associations with the previous frame.
CenterTrack is simple, online (no peeking into the future), and real-time.
CenterTrack is easily extended to monocular 3D tracking by regressing additional 3D attributes.
Tracking-by-detection. These models rely on a given accurate recognition to identify objects and then link them up through time in a separate stage.
Recent work on simultaneous detection and tracking [1, 8] has made progress in alleviating some of this complexity.
Each object is represented by a single point at the center of its bounding box.
Using the CenterNet detector
Specifically, we adopt the recent CenterNet detector to localize object centers.
We condition the detector on two consecutive frames, as well as a heatmap of prior tracklets, represented as points. We train the detector to also output an offset vector from the current object center to its center in the previous frame.
If each object in past frames is represented by a single point, a constellation of objects can be represented by a heatmap of points.
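To make the idea of a heatmap of points concrete, here is a minimal sketch in plain Python. The function name `render_center_heatmap`, the fixed `sigma`, and the nested-list layout are assumptions made for illustration; CenterNet-style detectors typically choose the Gaussian radius per object based on its size, which this sketch does not do.

```python
import math


def render_center_heatmap(centers, height, width, sigma=2.0):
    """Render (x, y) object centers as Gaussian peaks on a height x width grid.

    Hypothetical helper: a fixed sigma is used for every object (an assumption).
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for cx, cy in centers:
        for y in range(height):
            for x in range(width):
                dist_sq = (x - cx) ** 2 + (y - cy) ** 2
                value = math.exp(-dist_sq / (2.0 * sigma ** 2))
                # Where peaks overlap, keep the larger response so that
                # nearby objects do not sum to values above 1.
                if value > heatmap[y][x]:
                    heatmap[y][x] = value
    return heatmap


# Two prior-frame objects rendered on a small 5 x 8 grid.
prior_heatmap = render_center_heatmap([(3, 2), (6, 4)], height=5, width=8)
```

Each prior object contributes a peak of value 1 at its center, so the whole set of prior tracklets fits in a single-channel map that can be fed to the detector alongside the images.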
Point-based tracking simplifies object association across time. A simple displacement prediction, akin to sparse optical flow, allows objects in different frames to be linked.
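The linking step can be sketched as a greedy nearest-neighbor match, shown here as one possible reading of the idea. The function name `associate`, the dictionary fields, the `max_dist` threshold, and the `id_source` iterator are all hypothetical. The sketch also assumes the offset points from the current center to the previous-frame center, so the predicted previous position is `center + offset`; an implementation with the opposite sign convention would subtract instead.

```python
import math
from itertools import count


def associate(detections, prior_tracks, max_dist, id_source):
    """Greedily link current detections to prior tracks via predicted offsets.

    detections:   dicts with 'center' (x, y), 'offset' (dx, dy), 'score'.
    prior_tracks: dicts with 'center' (x, y) and 'id'.
    Returns a list of (detection, track_id) pairs.
    """
    unmatched = list(prior_tracks)
    results = []
    # Most confident detections get first pick of the prior tracks.
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        # Predicted position of this object in the previous frame.
        px = det["center"][0] + det["offset"][0]
        py = det["center"][1] + det["offset"][1]
        best_idx, best_dist = None, max_dist
        for idx, track in enumerate(unmatched):
            dist = math.hypot(track["center"][0] - px, track["center"][1] - py)
            if dist < best_dist:
                best_idx, best_dist = idx, dist
        if best_idx is None:
            # No prior track close enough: start a new track.
            track_id = next(id_source)
        else:
            # Each prior track can be claimed only once.
            track_id = unmatched.pop(best_idx)["id"]
        results.append((det, track_id))
    return results


prior = [{"id": 1, "center": (10.0, 10.0)}, {"id": 2, "center": (50.0, 50.0)}]
dets = [
    {"center": (12.0, 10.0), "offset": (-2.0, 0.0), "score": 0.9},
    {"center": (100.0, 100.0), "offset": (0.0, 0.0), "score": 0.8},
]
links = associate(dets, prior, max_dist=5.0, id_source=count(3))
```

In this example the first detection inherits track 1 because its predicted previous position lands on that track's center, while the second detection has no prior track within `max_dist` and receives a fresh identity.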
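Tracking-by-detection. These methods typically need an additional feature-extraction network to match data between consecutive frames. Detection and tracking are kept separate.
Joint detection and tracking. The detector and the tracker are merged into a single model.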
Motion prediction. Early approaches [2,47] used Kalman filters to model object velocities. Our center offset prediction is analogous to sparse optical flow, but is learned together with the detection network and does not require dense supervision.
Heatmap-conditioned keypoint estimation. A rendered heatmap of prior keypoints [4, 11, 29, 44] is especially appealing in tracking for two reasons. First, the information in the previous frame is freely available and does not slow down the detector. Second, conditional tracking can reason about occluded objects that may no longer be visible in the current frame. The tracker can simply learn to keep those detections from the prior frame around.
3D object detection and tracking.
Tracking objects as points
- Find all target objects in every frame, including occluded ones
- Match the targets across two consecutive frames
Tracking-conditioned detection
Association through offsets
Training on video data
Training on static image data
End-to-end 3D object tracking
Simple Unsupervised Multi-Object Tracking
Multiple People Tracking by Lifted Multicut and Person Re-identification