Extracting Invariant Features from Images Using an Equivariant Autoencoder
Abstract
Convolutional Neural Networks achieve state-of-the-art results in many image recognition tasks. While their structure makes predictions invariant to small translations, some recognition tasks require invariance to other transformations, such as rotation and reflection. We apply group convolutions to build an Equivariant Autoencoder whose embeddings change predictably under a specified set of transformations. We then introduce two approaches to extracting invariant features from these embeddings—Gram Pooling and Equivariant Attention. These two methods separate transformation-relevant information from all other image features. We use the obtained embeddings in classification and clustering tasks and show improved classification quality on the learned embeddings compared to both a pure autoencoder and an average pooling baseline. A visualization of the learned manifold shows that objects of the same class tend to cluster together, which was not observed for the pure autoencoder embeddings.
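The paper's Gram Pooling builds on a simple observation: if an input transformation (e.g., a 90° rotation) cyclically permutes the group axis of an equivariant feature stack, then the Gram matrix taken over that axis is unchanged. Below is a minimal NumPy sketch of this idea for the C4 rotation group; the toy `encoder` and all names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def orbit_features(x, encoder):
    # Stack encoder outputs over the four rotations of the C4 group.
    # Rotating x by 90 degrees cyclically shifts this stack's rows.
    return np.stack([encoder(np.rot90(x, k)) for k in range(4)])

def gram_pool(feats):
    # feats: (|G|, d). The channel-by-channel Gram matrix is invariant
    # to any permutation of the group axis, hence to input rotations.
    return feats.T @ feats

# Toy encoder (illustrative): flatten and project with a fixed matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 3))
encoder = lambda img: img.reshape(-1) @ W

img = rng.standard_normal((4, 4))
g0 = gram_pool(orbit_features(img, encoder))
g1 = gram_pool(orbit_features(np.rot90(img), encoder))
assert np.allclose(g0, g1)  # same Gram matrix for the rotated input
```

The Gram matrix discards the ordering of group elements (the "pose") while keeping correlations between feature channels, which is exactly the separation of transformation-relevant information from the rest that the abstract describes.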
Cite
Text
Kuzminykh et al. "Extracting Invariant Features from Images Using an Equivariant Autoencoder." Proceedings of The 10th Asian Conference on Machine Learning, 2018.
Markdown
[Kuzminykh et al. "Extracting Invariant Features from Images Using an Equivariant Autoencoder." Proceedings of The 10th Asian Conference on Machine Learning, 2018.](https://mlanthology.org/acml/2018/kuzminykh2018acml-extracting/)
BibTeX
@inproceedings{kuzminykh2018acml-extracting,
title = {{Extracting Invariant Features from Images Using an Equivariant Autoencoder}},
author = {Kuzminykh, Denis and Polykovskiy, Daniil and Zhebrak, Alexander},
booktitle = {Proceedings of The 10th Asian Conference on Machine Learning},
year = {2018},
pages = {438--453},
volume = {95},
url = {https://mlanthology.org/acml/2018/kuzminykh2018acml-extracting/}
}