360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking

Abstract

360° images provide an omnidirectional field of view, which is important for stable, long-term scene perception. In this paper, we explore 360° images for visual object tracking and identify new challenges caused by large distortion, stitching artifacts, and other unique attributes of 360° images. To alleviate these problems, we take advantage of a novel representation of target localization, i.e., the bounding field-of-view, and introduce a general 360 tracking framework that can adopt typical trackers for omnidirectional tracking. More importantly, we propose a new large-scale omnidirectional tracking benchmark dataset, 360VOT, to facilitate future research. 360VOT contains 120 sequences with up to 113K high-resolution frames in equirectangular projection, and the tracking targets cover 32 categories in diverse scenarios. Moreover, we provide 4 types of unbiased ground truth, including (rotated) bounding boxes and (rotated) bounding field-of-views, as well as new metrics tailored for 360° images that allow accurate evaluation of omnidirectional tracking performance. Finally, we extensively evaluate 20 state-of-the-art visual trackers and provide a new baseline for future comparisons. Homepage: https://360vot.hkustvgd.com
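
As background for the equirectangular frames mentioned above, the following is a minimal sketch of the standard mapping between equirectangular pixel coordinates and spherical angles. It is not taken from the 360VOT code. The function names and the angle conventions (longitude in [-180, 180) with 0 at the image center, latitude +90 at the top row) are assumptions made for illustration.

```python
def pixel_to_sphere(u, v, width, height):
    """Map an equirectangular pixel (u, v) to (longitude, latitude) in degrees.

    Assumed convention: the image center is (0, 0) degrees, the left edge is
    -180 degrees longitude, and the top row is +90 degrees latitude.
    """
    lon = (u / width - 0.5) * 360.0
    lat = (0.5 - v / height) * 180.0
    return lon, lat


def sphere_to_pixel(lon, lat, width, height):
    """Inverse mapping; the horizontal coordinate wraps around the image."""
    u = ((lon / 360.0 + 0.5) * width) % width
    v = (0.5 - lat / 180.0) * height
    return u, v


def lon_diff(a, b):
    """Signed shortest longitude difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0
```

Under these assumptions, `lon_diff(170, -170)` gives -20 rather than 340. This illustrates why a target that crosses the left or right image border is still compact on the sphere even though its pixel coordinates jump across the frame.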

Cite

Text

Huang et al. "360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01880

Markdown

[Huang et al. "360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/huang2023iccv-360vot/) doi:10.1109/ICCV51070.2023.01880

BibTeX

@inproceedings{huang2023iccv-360vot,
  title     = {{360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking}},
  author    = {Huang, Huajian and Xu, Yinzhe and Chen, Yingshu and Yeung, Sai-Kit},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {20566--20576},
  doi       = {10.1109/ICCV51070.2023.01880},
  url       = {https://mlanthology.org/iccv/2023/huang2023iccv-360vot/}
}