TAPVid-3D: A Benchmark for Tracking Any Point in 3D

Abstract

We introduce a new benchmark, TAPVid-3D, for evaluating the task of long-range Tracking Any Point in 3D (TAP-3D). While point tracking in two dimensions (TAP-2D) has many benchmarks measuring performance on real-world videos, such as TAPVid-DAVIS, three-dimensional point tracking has none. To this end, leveraging existing footage, we build a new benchmark for 3D point tracking featuring 4,000+ real-world videos, composed of three different data sources spanning a variety of object types, motion patterns, and indoor and outdoor environments. To measure performance on the TAP-3D task, we formulate a collection of metrics that extend the Jaccard-based metric used in TAP-2D to handle the complexities of ambiguous depth scales across models, occlusions, and multi-track spatio-temporal smoothness. We manually verify a large sample of trajectories to ensure correct video annotations, and assess the current state of the TAP-3D task by constructing competitive baselines using existing tracking models. We anticipate this benchmark will serve as a guidepost to improve our ability to understand precise 3D motion and surface deformation from monocular video.
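As a rough illustration of how a Jaccard-style tracking score can be carried over to 3D while tolerating an unknown global depth scale, the following minimal sketch may help. It is not the paper's metric definition. The function names, the fixed distance thresholds, and the choice of a single global median-depth rescaling are all assumptions made for this example. Points are flattened to one entry per (track, frame) pair, and at least one point is assumed to be visible in the ground truth.

```python
import math
from statistics import median


def median_depth_scale(pred, gt, gt_visible):
    """Return the factor that aligns the median predicted depth with the ground truth.

    Only points visible in the ground truth contribute (an assumption of this sketch).
    """
    pred_z = [p[2] for p, v in zip(pred, gt_visible) if v]
    gt_z = [g[2] for g, v in zip(gt, gt_visible) if v]
    return median(gt_z) / median(pred_z)


def jaccard_3d(pred, pred_visible, gt, gt_visible, thresholds=(0.05, 0.1, 0.2)):
    """Average a Jaccard score over several 3D distance thresholds.

    pred, gt: lists of (x, y, z) points, one per (track, frame) pair.
    pred_visible, gt_visible: matching lists of booleans.
    thresholds: hypothetical distance thresholds, in the units of gt.
    """
    # Remove the global scale ambiguity before measuring distances.
    scale = median_depth_scale(pred, gt, gt_visible)
    dists = [math.dist([c * scale for c in p], g) for p, g in zip(pred, gt)]

    scores = []
    for t in thresholds:
        tp = fp = fn = 0
        for d, pv, gv in zip(dists, pred_visible, gt_visible):
            # A hit needs agreement on visibility and a small 3D error.
            hit = d < t and pv and gv
            tp += hit
            fp += pv and not hit  # predicted visible, but occluded or too far
            fn += gv and not hit  # truly visible, but missed or too far
        total = tp + fp + fn
        scores.append(tp / total if total else 0.0)
    return sum(scores) / len(scores)
```

In this sketch, a prediction that matches the ground truth up to a constant scale factor receives a perfect score, while a point predicted as visible when the ground truth marks it occluded counts as a false positive.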

Cite

Text

Koppula et al. "TAPVid-3D: A Benchmark for Tracking Any Point in 3D." Neural Information Processing Systems, 2024. doi:10.52202/079017-2611

Markdown

[Koppula et al. "TAPVid-3D: A Benchmark for Tracking Any Point in 3D." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/koppula2024neurips-tapvid3d/) doi:10.52202/079017-2611

BibTeX

@inproceedings{koppula2024neurips-tapvid3d,
  title     = {{TAPVid-3D: A Benchmark for Tracking Any Point in 3D}},
  author    = {Koppula, Skanda and Rocco, Ignacio and Yang, Yi and Heyward, Joe and Carreira, João and Zisserman, Andrew and Brostow, Gabriel and Doersch, Carl},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-2611},
  url       = {https://mlanthology.org/neurips/2024/koppula2024neurips-tapvid3d/}
}