Unsupervised Learning of Image Transformations
Abstract
We describe a probabilistic model for learning rich, distributed representations of image transformations. The basic model is defined as a gated conditional random field that is trained to predict transformations of its inputs using a factorial set of latent variables. Inference in the model consists in extracting the transformation, given a pair of images, and can be performed exactly and efficiently. We show that, when trained on natural videos, the model develops domain specific motion features, in the form of fields of locally transformed edge filters. When trained on affine, or more general, transformations of still images, the model develops codes for these transformations, and can subsequently perform recognition tasks that are invariant under these transformations. It can also fantasize new transformations on previously unseen images. We describe several variations of the basic model and provide experimental results that demonstrate its applicability to a variety of tasks.
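The abstract's claim that inference "can be performed exactly and efficiently" follows from the model's structure: given a pair of images, the latent variables are conditionally independent, so each one's posterior is a closed-form logistic function of a three-way product between the input image, the output image, and a weight tensor. Below is a minimal NumPy sketch of that inference step under assumed toy dimensions; the tensor `W`, the biases `b`, and the function name `infer_mapping` are illustrative choices, not names from the paper, and the random weights stand in for learned ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (the paper works on image patches): input pixels,
# output pixels, and binary latent "mapping" units.
n_x, n_y, n_h = 16, 16, 8

# Hypothetical three-way weight tensor: W[i, j, k] couples input
# pixel i, output pixel j, and latent unit k. Randomly initialized
# here; in the model these would be learned from data.
W = rng.normal(scale=0.1, size=(n_x, n_y, n_h))
b = np.zeros(n_h)  # latent-unit biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def infer_mapping(x, y):
    """Exact inference of the transformation code.

    Given the image pair (x, y), the latent units are conditionally
    independent, so each posterior is computed in closed form:
        p(h_k = 1 | x, y) = sigmoid(sum_ij W[i, j, k] * x_i * y_j + b_k)
    """
    return sigmoid(np.einsum('i,j,ijk->k', x, y, W) + b)

x = rng.random(n_x)      # "before" image, flattened
y = rng.random(n_y)      # "after" image, flattened
h = infer_mapping(x, y)  # distributed code for the transformation
```

The `einsum` contracts the outer product of the two images against the weight tensor in one pass, which is why inference costs only a single matrix-like product per image pair.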
Cite
Text
Memisevic and Hinton. "Unsupervised Learning of Image Transformations." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007. doi:10.1109/CVPR.2007.383036
Markdown
[Memisevic and Hinton. "Unsupervised Learning of Image Transformations." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007.](https://mlanthology.org/cvpr/2007/memisevic2007cvpr-unsupervised/) doi:10.1109/CVPR.2007.383036
BibTeX
@inproceedings{memisevic2007cvpr-unsupervised,
  title     = {{Unsupervised Learning of Image Transformations}},
  author    = {Memisevic, Roland and Hinton, Geoffrey E.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2007},
  doi       = {10.1109/CVPR.2007.383036},
  url       = {https://mlanthology.org/cvpr/2007/memisevic2007cvpr-unsupervised/}
}