Autoencoding Beyond Pixels Using a Learned Similarity Metric

Abstract

We present an autoencoder that leverages learned representations to better measure similarities in data space. By combining a variational autoencoder (VAE) with a generative adversarial network (GAN), we can use the learned feature representations in the GAN discriminator as a basis for the VAE reconstruction objective. Thereby, we replace element-wise errors with feature-wise errors to better capture the data distribution while offering invariance towards, e.g., translation. We apply our method to images of faces and show that it outperforms VAEs with element-wise similarity measures in terms of visual fidelity. Moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g. wearing glasses) can be modified using simple arithmetic.
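The core idea, replacing the pixel-wise VAE reconstruction term with an error measured in the feature space of the GAN discriminator, can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' code: the network interfaces (`encoder`, `decoder`, `discriminator`) and the choice of discriminator layer used for the feature comparison are assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

# Assumed (hypothetical) network interfaces, not the paper's exact architectures:
#   encoder(x)       -> (mu, logvar)        parameters of the approximate posterior q(z|x)
#   decoder(z)       -> x_tilde             decoded image
#   discriminator(x) -> (features, logit)   hidden-layer features l(x) and a real/fake logit

def vaegan_losses(encoder, decoder, discriminator, x):
    # Encode and reparameterize: z = mu + sigma * eps
    mu, logvar = encoder(x)
    eps = torch.randn_like(mu)
    z = mu + torch.exp(0.5 * logvar) * eps

    # Decode a reconstruction, and also draw a sample from the prior p(z)
    x_tilde = decoder(z)
    x_p = decoder(torch.randn_like(z))

    # KL divergence between q(z|x) and the unit-Gaussian prior
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())

    # Feature-wise reconstruction error: compare discriminator features
    # l(x) and l(x_tilde) instead of raw pixel values.
    feat_x, logit_x = discriminator(x)
    feat_tilde, logit_tilde = discriminator(x_tilde)
    _, logit_p = discriminator(x_p)
    rec = F.mse_loss(feat_tilde, feat_x, reduction="sum")

    # Standard GAN objective: real images vs. reconstructions and prior samples
    ones = torch.ones_like(logit_x)
    zeros = torch.zeros_like(logit_x)
    gan = (F.binary_cross_entropy_with_logits(logit_x, ones)
           + F.binary_cross_entropy_with_logits(logit_tilde, zeros)
           + F.binary_cross_entropy_with_logits(logit_p, zeros))

    return kl, rec, gan
```

In the paper, the encoder, decoder, and discriminator are each updated with different weighted combinations of these three terms; the exact weighting and training schedule are given in the original work.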

Cite

Text

Larsen et al. "Autoencoding Beyond Pixels Using a Learned Similarity Metric." International Conference on Machine Learning, 2016.

Markdown

[Larsen et al. "Autoencoding Beyond Pixels Using a Learned Similarity Metric." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/larsen2016icml-autoencoding/)

BibTeX

@inproceedings{larsen2016icml-autoencoding,
  title     = {{Autoencoding Beyond Pixels Using a Learned Similarity Metric}},
  author    = {Larsen, Anders Boesen Lindbo and Sønderby, Søren Kaae and Larochelle, Hugo and Winther, Ole},
  booktitle = {International Conference on Machine Learning},
  year      = {2016},
  pages     = {1558--1566},
  volume    = {48},
  url       = {https://mlanthology.org/icml/2016/larsen2016icml-autoencoding/}
}