Jan. 26, 2022, 2:10 a.m. | Tianyu Zhu, Markus Hiller, Mahsa Ehsanpour, Rongkai Ma, Tom Drummond, Hamid Rezatofighi

Tracking a time-varying, indefinite number of objects in a video sequence
remains a challenge despite recent advances in the field. Because they ignore
long-term temporal information, most existing approaches cannot properly
handle multi-object tracking challenges such as occlusion. To address
these shortcomings, we present MO3TR: a truly end-to-end Transformer-based
online multi-object tracking (MOT) framework that learns to handle occlusions,
track initiation and termination without the need for an explicit data
association module or any heuristics/post-processing. MO3TR encodes …

