Learning Pixel Trajectories with Multiscale Contrastive Random Walks

Abstract

A range of video modeling tasks, from optical flow to multiple object tracking, share the same fundamental challenge: establishing space-time correspondence. Yet the approaches that dominate each task differ. We take a step towards bridging this gap by extending the recent contrastive random walk formulation to much denser, pixel-level space-time graphs. The main contribution is introducing hierarchy into the search problem by computing the transition matrix in a coarse-to-fine manner, forming a multiscale contrastive random walk. This establishes a unified technique for self-supervised learning of optical flow, keypoint tracking, and video object segmentation. Experiments demonstrate that, for each of these tasks, our unified model achieves performance competitive with strong self-supervised approaches specific to that task.
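
The underlying mechanism is easiest to see in code. Below is a minimal sketch of the contrastive random walk cycle objective that the paper builds on, assuming per-frame node embeddings have already been extracted and L2-normalized by some encoder; the function names, the temperature value, and the flat `(N, C)` node layout are illustrative assumptions, not the authors' implementation. The paper's multiscale contribution additionally restricts each fine-scale transition to a local window around the coarse-scale match, which is omitted here for brevity.

```python
import torch
import torch.nn.functional as F

def make_transition(feat_a, feat_b, temperature=0.07):
    """Row-stochastic transition matrix between two frames.

    feat_a, feat_b: (N, C) L2-normalized node embeddings (one row per
    pixel/patch). Entry (i, j) is the probability of stepping from node i
    in frame A to node j in frame B.
    """
    affinity = feat_a @ feat_b.t() / temperature  # (N, N) similarity logits
    return F.softmax(affinity, dim=1)

def crw_cycle_loss(feats):
    """Palindrome cycle loss: walk forward through the frames and back,
    and require each node to return to itself.

    feats: list of (N, C) L2-normalized embeddings, one per frame.
    """
    frames = feats + feats[-2::-1]  # t0 .. tK .. t0 (palindrome sequence)
    walk = None
    for a, b in zip(frames[:-1], frames[1:]):
        step = make_transition(a, b)
        walk = step if walk is None else walk @ step  # compose transitions
    # Supervise the round trip: node i should land back on node i.
    targets = torch.arange(walk.shape[0], device=walk.device)
    return F.nll_loss(torch.log(walk + 1e-8), targets)
```

For three frames of N nodes each, `crw_cycle_loss([f0, f1, f2])` composes the walk f0 → f1 → f2 → f1 → f0 and penalizes any node whose round trip does not return to its starting position, which is the cycle-consistency signal that makes the objective self-supervised.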

Cite

Text

Bian et al. "Learning Pixel Trajectories with Multiscale Contrastive Random Walks." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00640

Markdown

[Bian et al. "Learning Pixel Trajectories with Multiscale Contrastive Random Walks." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/bian2022cvpr-learning/) doi:10.1109/CVPR52688.2022.00640

BibTeX

@inproceedings{bian2022cvpr-learning,
  title     = {{Learning Pixel Trajectories with Multiscale Contrastive Random Walks}},
  author    = {Bian, Zhangxing and Jabri, Allan and Efros, Alexei A. and Owens, Andrew},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {6508--6519},
  doi       = {10.1109/CVPR52688.2022.00640},
  url       = {https://mlanthology.org/cvpr/2022/bian2022cvpr-learning/}
}