Tracklet [1] is a novel method for optimizing tracklet consistency that directly takes the prediction errors into account. This post collects some notes on reading and implementing it.
Contents
Paper & Code & note
Paper: Multi-object Tracking via End-to-end Tracklet Searching and Ranking (arXiv 2020 paper)
Code: [Pytorch][Updating]
Note: Tracklet
Paper
Abstract
- Recent work uses sequence models to calculate the similarity score between the detections and the previous tracklets, but the forced exposure to ground truth in the training stage leads to a training-inference discrepancy problem.
- This paper directly takes the prediction errors into account to optimize tracklet consistency.
- It achieves state-of-the-art results on the MOT15-17 challenge benchmarks using public detections and online settings.
Problem Description
- Pairwise-detection matching based on an affinity model: it has limited capability to associate long-term consistent trajectories.
- Affinity model built on a sequence model: the tracklet's representative feature used for matching can be somewhat ill-posed, and the idealized assumption introduces a potential vulnerability.
Problem Solution
- They propose a global score to measure the inner appearance consistency of a tracklet.
- The whole tracklet is optimized with a margin loss.
- A novel algorithm simulates the prediction-time data distribution during training by introducing realistic discombobulated candidates to the model.
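To make the margin idea concrete, here is a minimal sketch in plain Python. It assumes a hinge-style formulation in which the score of the correct tracklet should exceed the score of every hypothesis (negative) tracklet by at least a margin. The function name, the default margin value, and the averaging over negatives are illustrative assumptions, not the exact loss from the paper.

```python
def margin_ranking_loss(pos_score, neg_scores, margin=0.5):
    """Mean hinge loss that is zero once pos_score beats each negative by `margin`.

    pos_score:  global consistency score of the correct tracklet (assumed scalar)
    neg_scores: scores of hypothesis tracklets containing wrong associations
    margin:     hypothetical hyperparameter, not a value taken from the paper
    """
    if not neg_scores:
        return 0.0
    losses = [max(0.0, margin - (pos_score - s)) for s in neg_scores]
    return sum(losses) / len(losses)


# The first negative is already separated by more than the margin (no penalty);
# the second is too close to the positive score and contributes to the loss.
loss = margin_ranking_loss(0.9, [0.2, 0.7], margin=0.5)
print(round(loss, 4))
```

Only hypotheses that score too close to the correct tracklet contribute, so training effort concentrates on the confusable candidates.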
Conceptual Understanding
- Tracklet-level tracking: it constructs an affinity model on the tracklet level and then uses it to associate a tracklet with a detection or to connect short tracklets.
- Pairwise association methods: they establish an affinity model on the isolated detections and then generate tracking results from the bottom up.
The common concern of these two types of methods is to guarantee the consistency of the entire associated trajectory.
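The difference between the two families can be illustrated with a toy comparison. In this sketch the pairwise affinity looks only at the most recent detection, while the tracklet-level affinity summarizes the whole tracklet first. Mean pooling and cosine similarity are stand-ins chosen for simplicity; they are assumptions for illustration and not the affinity models used in the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_affinity(tracklet, detection):
    """Affinity on isolated detections: compare only the last detection."""
    return cosine(tracklet[-1], detection)


def tracklet_affinity(tracklet, detection):
    """Affinity on the tracklet level: compare a pooled tracklet feature."""
    dim = len(tracklet[0])
    mean = [sum(f[i] for f in tracklet) / len(tracklet) for i in range(dim)]
    return cosine(mean, detection)


# The last detection of the tracklet is corrupted (e.g. by occlusion).
tracklet = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
detection = [1.0, 0.0]
print(pairwise_affinity(tracklet, detection))   # relies on the corrupted frame
print(tracklet_affinity(tracklet, detection))   # still recognizes the target
```

A single bad frame wipes out the pairwise affinity, whereas the pooled tracklet feature still matches the new detection reasonably well.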
Core Conception
- Training procedure: it follows a “searching-learning-ranking-pruning” pipeline.
- Scoring network: the appearance feature of each detection is extracted with a CNN (ResNet-50), and the appearance embedding of the tracklet is obtained through an encoder (LSTM).
- It is trained by online hypothesis tracklet searching with a margin loss and a rank loss.
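One possible reading of the searching, ranking, and pruning steps is a beam search over hypothesis tracklets, sketched below in plain Python. This is not the authors' implementation: detections are reduced to scalar appearance values, `consistency_score` is a toy stand-in for the CNN+LSTM scoring network, `beam_width` is a hypothetical parameter, and the learning step (updating the network with the margin and rank losses) is omitted.

```python
def consistency_score(tracklet):
    """Toy stand-in for the scoring network: higher means more consistent.

    Returns the negative mean absolute change between consecutive
    appearance values (assumed scalars for this sketch).
    """
    if len(tracklet) < 2:
        return 0.0
    diffs = [abs(a - b) for a, b in zip(tracklet, tracklet[1:])]
    return -sum(diffs) / len(diffs)


def search_rank_prune(start, frames, beam_width=2):
    """Grow hypothesis tracklets frame by frame and keep the best few."""
    hypotheses = [[start]]
    for candidates in frames:
        # Searching: extend every hypothesis with every candidate detection.
        extended = [h + [c] for h in hypotheses for c in candidates]
        # Ranking: order hypotheses by their global consistency score.
        extended.sort(key=consistency_score, reverse=True)
        # Pruning: keep only the top `beam_width` hypotheses.
        hypotheses = extended[:beam_width]
    return hypotheses


# Each inner list holds the candidate detections of one frame.
frames = [[1.1, 5.0], [0.9, 5.2], [1.0, 4.8]]
best = search_rank_prune(1.0, frames, beam_width=2)
print(best[0])
```

In a training loop, the surviving lower-ranked hypotheses could serve as the realistic negative candidates that the losses compare against the correct tracklet.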
Experiments
They report quantitative results on the three datasets of the MOT Challenge benchmark (MOT15, MOT16, and MOT17).
Code
Note
More details of Tracklet optimization and the like can be found in [2].
References
[1] Hu, T., Huang, L., & Shen, H. (2020). Multi-object Tracking via End-to-end Tracklet Searching and Ranking. arXiv preprint arXiv:2003.02795.
[2] Change_ZH. “Tracklet: MOT Scoring Network.” https://blog.csdn.net/qq_36449741/article/details/104815321?depth_1-utm_source=distribute.pc_relevant.none-task&utm_source=distribute.pc_relevant.none-task.