OVTrack: Open-Vocabulary Multiple Object Tracking

Abstract

The ability to recognize, localize, and track dynamic objects in a scene is fundamental to many real-world applications, such as self-driving and robotic systems. Yet traditional multiple object tracking (MOT) benchmarks rely on only a few object categories that hardly represent the multitude of objects encountered in the real world, leaving contemporary MOT methods limited to a small set of pre-defined categories. In this paper, we address this limitation by tackling a novel task, open-vocabulary MOT, which aims to evaluate tracking beyond pre-defined training categories. We further develop OVTrack, an open-vocabulary tracker capable of tracking arbitrary object classes. Its design rests on two key ingredients: first, leveraging vision-language models for both classification and association via knowledge distillation; second, a data hallucination strategy for robust appearance feature learning from denoising diffusion probabilistic models. The result is an extremely data-efficient open-vocabulary tracker that sets a new state of the art on the large-scale, large-vocabulary TAO benchmark while being trained solely on static images. The project page is at https://www.vis.xyz/pub/ovtrack/.
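The abstract's first ingredient, using a shared vision-language embedding space for both classification and association, can be illustrated with a minimal sketch. This is not the paper's implementation: the embeddings below are random stand-ins for CLIP-style text embeddings of class prompts and for per-detection appearance features distilled into the same space, and all array names are hypothetical.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between each row of `a` and each row of `b`."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(0)

# Hypothetical stand-ins: text embeddings for three arbitrary class prompts
# (e.g. "cat", "dog", "skateboard") and two detection embeddings distilled
# into the same 512-d space, here simulated as noisy copies of classes 1 and 0.
text_emb = rng.normal(size=(3, 512))
det_emb = text_emb[[1, 0]] + 0.1 * rng.normal(size=(2, 512))

# Open-vocabulary classification: each detection takes the label of the
# nearest text embedding, so the vocabulary is whatever prompts you encode.
cls = cosine_sim(det_emb, text_emb).argmax(axis=1)

# Association: match current detections to previous-frame tracks by
# appearance similarity in the same embedding space (no motion model here).
prev_emb = det_emb + 0.05 * rng.normal(size=(2, 512))
match = cosine_sim(det_emb, prev_emb).argmax(axis=1)
```

Because classification is a nearest-neighbor lookup against text prompts rather than a fixed softmax head, the same appearance features serve both recognition of unseen categories and frame-to-frame association.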

Cite

Text

Li et al. "OVTrack: Open-Vocabulary Multiple Object Tracking." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00539

Markdown

[Li et al. "OVTrack: Open-Vocabulary Multiple Object Tracking." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/li2023cvpr-ovtrack/) doi:10.1109/CVPR52729.2023.00539

BibTeX

@inproceedings{li2023cvpr-ovtrack,
  title     = {{OVTrack: Open-Vocabulary Multiple Object Tracking}},
  author    = {Li, Siyuan and Fischer, Tobias and Ke, Lei and Ding, Henghui and Danelljan, Martin and Yu, Fisher},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {5567--5577},
  doi       = {10.1109/CVPR52729.2023.00539},
  url       = {https://mlanthology.org/cvpr/2023/li2023cvpr-ovtrack/}
}