Learning Disentangled Discrete Representations

Abstract

Recent successes in image generation, model-based reinforcement learning, and text-to-image generation have demonstrated the empirical advantages of discrete latent representations, although the reasons behind their benefits remain unclear. We explore the relationship between discrete latent spaces and disentangled representations by replacing the standard Gaussian variational autoencoder (VAE) with a tailored categorical variational autoencoder. We show that the underlying grid structure of categorical distributions mitigates the problem of rotational invariance associated with multivariate Gaussian distributions, acting as an efficient inductive prior for disentangled representations. We provide both analytical and empirical findings that demonstrate the advantages of discrete VAEs for learning disentangled representations. Furthermore, we introduce the first unsupervised model selection strategy that favors disentangled representations.
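The categorical VAE described above replaces each Gaussian latent with a discrete distribution over a fixed set of codes. A common way to keep such a model differentiable end-to-end is the Gumbel-Softmax relaxation; the sketch below illustrates that idea under stated assumptions, not the authors' actual implementation (the function name and parameters are hypothetical).

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, rng=None):
    """Draw a relaxed one-hot sample from a categorical distribution
    via the Gumbel-Softmax trick (illustrative sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise makes argmax(logits + g) an exact categorical
    # sample; dividing by the temperature tau and taking a softmax yields
    # a differentiable relaxation that sharpens as tau -> 0.
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + g) / tau
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

# Each latent dimension is a categorical over K codes on a one-dimensional
# grid, so the joint latent space forms a fixed axis-aligned grid -- unlike
# a multivariate Gaussian, which is invariant under rotation.
logits = np.log(np.array([0.7, 0.2, 0.1]))
sample = gumbel_softmax_sample(logits, tau=0.5)
```

Because the grid is axis-aligned by construction, the model cannot "rotate" factors of variation into each other, which is the inductive bias the paper argues encourages disentanglement.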

Cite

Text

Friede et al. "Learning Disentangled Discrete Representations." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43421-1_35

Markdown

[Friede et al. "Learning Disentangled Discrete Representations." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/friede2023ecmlpkdd-learning/) doi:10.1007/978-3-031-43421-1_35

BibTeX

@inproceedings{friede2023ecmlpkdd-learning,
  title     = {{Learning Disentangled Discrete Representations}},
  author    = {Friede, David and Reimers, Christian and Stuckenschmidt, Heiner and Niepert, Mathias},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2023},
  pages     = {593--609},
  doi       = {10.1007/978-3-031-43421-1_35},
  url       = {https://mlanthology.org/ecmlpkdd/2023/friede2023ecmlpkdd-learning/}
}