Learning Image Matching by Simply Watching Video
Abstract
This work presents an unsupervised learning-based approach to the ubiquitous computer vision problem of image matching. We start from the insight that the problem of frame interpolation implicitly solves for inter-frame correspondences. This permits the application of analysis-by-synthesis: we first train and apply a Convolutional Neural Network for frame interpolation, then obtain correspondences by inverting the learned CNN. The key benefit of this strategy is that the CNN for frame interpolation can be trained in an unsupervised manner by exploiting the temporal coherence naturally contained in real-world video sequences. The present model therefore learns image matching by simply “watching videos”. Besides promising broader applicability, the presented approach achieves surprising performance comparable to that of traditional empirically designed methods.
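The two-stage idea in the abstract, training an interpolator on frame triplets and then reading correspondences off its input sensitivities, can be sketched in miniature. The snippet below is illustrative only and not the paper's architecture: it uses a linear "interpolator" on 1-D frames instead of a CNN on images, a toy motion of one pixel per frame, and a gradient read-out as the stand-in for CNN inversion.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8                          # pixels per (1-D) toy frame
m = 200                        # number of training triplets

# Unsupervised data: in each triplet the scene shifts one pixel per
# frame, so the middle frame F1 is a free training label.
F0 = rng.normal(size=(m, n))
F1 = np.roll(F0, -1, axis=1)   # middle frame
F2 = np.roll(F0, -2, axis=1)   # last frame

# Linear interpolator y = W @ [f0; f2], trained by gradient descent
# to reproduce the middle frame (least squares).
X = np.concatenate([F0, F2], axis=1)
W = np.zeros((n, 2 * n))
lr = 0.05
for _ in range(500):
    pred = X @ W.T
    W -= lr * (pred - F1).T @ X / m

# "Inversion" in this linear toy model: the sensitivity |d y_i / d f0|
# of output pixel i is simply the f0-part of row i of W; its argmax is
# the proposed correspondence for pixel i of the interpolated frame.
i = 3
match = int(np.abs(W[i, :n]).argmax())
print(match)  # with one-pixel motion, the match is (i + 1) % n = 4
```

In the paper this read-out is done by back-propagating through the trained CNN to obtain the gradient of an interpolated pixel with respect to the input frames; the linear model above makes that gradient available directly as a row of the weight matrix.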
Cite
Text
Long et al. "Learning Image Matching by Simply Watching Video." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46466-4_26
Markdown
[Long et al. "Learning Image Matching by Simply Watching Video." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/long2016eccv-learning/) doi:10.1007/978-3-319-46466-4_26
BibTeX
@inproceedings{long2016eccv-learning,
title = {{Learning Image Matching by Simply Watching Video}},
author = {Long, Gucan and Kneip, Laurent and Álvarez, José M. and Li, Hongdong and Zhang, Xiaohu and Yu, Qifeng},
booktitle = {European Conference on Computer Vision},
year = {2016},
pages = {434--450},
doi = {10.1007/978-3-319-46466-4_26},
url = {https://mlanthology.org/eccv/2016/long2016eccv-learning/}
}