Learning Correspondence from the Cycle-Consistency of Time
Abstract
We introduce a self-supervised method for learning visual correspondence from unlabeled video. The main idea is to use cycle-consistency in time as a free supervisory signal for learning visual representations from scratch. At training time, our model learns a feature map representation that is useful for cycle-consistent tracking. At test time, we use the acquired representation to find nearest neighbors across space and time. We demonstrate the generalizability of the representation -- without finetuning -- across a range of visual correspondence tasks, including video object segmentation, keypoint tracking, and optical flow. Our approach outperforms previous self-supervised methods and performs competitively with strongly supervised methods.
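The cycle-consistency idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes per-frame patch features are already given as arrays, and it swaps the learned, differentiable tracker for hard nearest-neighbor matching under cosine similarity. The names `nearest_neighbor`, `cycle_track`, and `cycle_error` are hypothetical and chosen only for illustration. A patch is tracked forward through the frames and then backward to the first frame; the distance between where it started and where it lands is the kind of signal a model could be trained to minimize.

```python
import numpy as np


def nearest_neighbor(query, feats):
    """Return the row index of `feats` most similar to `query` (cosine similarity)."""
    q = query / np.linalg.norm(query)
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return int(np.argmax(f @ q))


def cycle_track(frames, start_idx):
    """Track one patch forward through `frames`, then backward to frame 0.

    frames: list of (N, D) arrays, one row of features per patch (assumed given).
    Returns the patch index reached in frames[0] at the end of the cycle.
    """
    idx = start_idx
    # Forward pass: frame 0 -> frame T-1.
    for t in range(1, len(frames)):
        idx = nearest_neighbor(frames[t - 1][idx], frames[t])
    # Backward pass: frame T-1 -> frame 0.
    for t in range(len(frames) - 2, -1, -1):
        idx = nearest_neighbor(frames[t + 1][idx], frames[t])
    return idx


def cycle_error(frames, coords, start_idx):
    """Euclidean distance between the start patch and the patch the cycle ends on.

    coords: (N, 2) array of patch positions in frame 0 (hypothetical layout).
    """
    end_idx = cycle_track(frames, start_idx)
    return float(np.linalg.norm(coords[end_idx] - coords[start_idx]))


if __name__ == "__main__":
    # Toy data: every frame holds the same six features in a shuffled order,
    # standing in for patches that move between frames.
    rng = np.random.default_rng(0)
    base = np.eye(6)
    frames = [base[rng.permutation(6)] for _ in range(4)]
    coords = np.stack(np.unravel_index(np.arange(6), (2, 3)), axis=1).astype(float)
    print(cycle_error(frames, coords, start_idx=2))
```

With perfectly discriminative features, as in this toy data, every cycle closes on its starting patch. With weaker features the cycle drifts, and the resulting error is what makes the signal usable for learning without labels.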
Cite
Text
Wang et al. "Learning Correspondence from the Cycle-Consistency of Time." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00267
Markdown
[Wang et al. "Learning Correspondence from the Cycle-Consistency of Time." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/wang2019cvpr-learning-b/) doi:10.1109/CVPR.2019.00267
BibTeX
@inproceedings{wang2019cvpr-learning-b,
title = {{Learning Correspondence from the Cycle-Consistency of Time}},
author = {Wang, Xiaolong and Jabri, Allan and Efros, Alexei A.},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2019},
doi = {10.1109/CVPR.2019.00267},
url = {https://mlanthology.org/cvpr/2019/wang2019cvpr-learning-b/}
}