ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders
Abstract
The variational autoencoder (VAE) is a popular deep latent-variable model (DLVM) due to its simple yet effective formulation for modeling the data distribution. Moreover, optimizing the VAE objective function is more manageable than for other DLVMs. The bottleneck dimension of the VAE is a crucial design choice with strong ramifications for the model's performance, such as finding the hidden explanatory factors of a dataset using the representations learned by the VAE. However, the size of the latent dimension of the VAE is often treated as a hyperparameter estimated empirically through trial and error. To this end, we propose a statistical formulation to discover the relevant latent factors required for modeling a dataset. In this work, we use a hierarchical prior in the latent space that estimates the variance of the latent axes using the encoded data, which identifies the relevant latent dimensions. For this, we replace the fixed prior in the VAE objective function with a hierarchical prior, keeping the remainder of the formulation unchanged. We call the proposed method the automatic relevancy detection in the variational autoencoder (ARD-VAE). We demonstrate the efficacy of the ARD-VAE on multiple benchmark datasets in finding the relevant latent dimensions and their effect on different evaluation metrics, such as FID score and disentanglement analysis.
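The idea of reading relevance off the per-axis variance of the encoded data can be illustrated with a small sketch. This is not the hierarchical-prior formulation of the paper, whose details the abstract does not give. It is a simplified stand-in that assumes a plain empirical variance and a fixed cutoff. The function name `relevant_dims`, the `threshold` value, and the synthetic "encoded" data are all hypothetical.

```python
import numpy as np

def relevant_dims(encoded_means, threshold=0.1):
    """Return the per-axis variance and the indices of axes above `threshold`.

    Assumption: relevance is judged by the empirical variance of the encoded
    data along each latent axis. The cutoff is hypothetical, not from the paper.
    """
    variances = encoded_means.var(axis=0)  # one variance estimate per latent axis
    return variances, np.flatnonzero(variances > threshold)

# Synthetic stand-in for encoder outputs: 1000 samples, 8 latent axes,
# of which only the first three carry appreciable variance.
rng = np.random.default_rng(0)
scales = np.array([1.0, 0.8, 1.2, 0.01, 0.01, 0.01, 0.01, 0.01])
z = rng.normal(size=(1000, 8)) * scales

variances, active = relevant_dims(z)
print("per-axis variance:", np.round(variances, 4))
print("axes flagged as relevant:", active.tolist())
```

In this toy setting the near-constant axes fall below the cutoff and are discarded. A learned hierarchical prior, as described in the abstract, would estimate these variances as part of training rather than in a post-hoc pass.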
Cite
Text
Saha et al. "ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders." Winter Conference on Applications of Computer Vision, 2025.
Markdown
[Saha et al. "ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders." Winter Conference on Applications of Computer Vision, 2025.](https://mlanthology.org/wacv/2025/saha2025wacv-ardvae/)
BibTeX
@inproceedings{saha2025wacv-ardvae,
title = {{ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders}},
author = {Saha, Surojit and Joshi, Sarang and Whitaker, Ross},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2025},
pages = {889-898},
url = {https://mlanthology.org/wacv/2025/saha2025wacv-ardvae/}
}