Deep Generative Clustering with Multimodal Variational Autoencoders
Abstract
Multimodal VAEs have recently received significant attention as generative models for weakly-supervised learning with multiple heterogeneous modalities. In parallel, VAE-based methods have been explored as probabilistic approaches for clustering tasks. Our work lies at the intersection of these two research directions. We propose a novel multimodal VAE model in which the latent space is extended to learn data clusters, leveraging shared information across modalities. Our experiments show that our proposed model improves generative performance over existing multimodal VAEs, particularly for unconditional generation. Furthermore, our method compares favorably to alternative clustering approaches in weakly-supervised settings. Notably, we propose a post-hoc procedure that removes the need for a priori knowledge of the true number of clusters, mitigating a critical limitation of previous clustering frameworks.
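The abstract does not spell out the architecture, but the idea it describes, a multimodal VAE whose shared latent space additionally carries cluster structure, is commonly realized with a learnable mixture-of-Gaussians prior over a latent shared across modalities. The sketch below is an illustrative assumption in that spirit, not the authors' implementation: the class name `MultimodalClusterVAE`, all layer sizes, the product-of-experts fusion, and the uniform mixture weights are hypothetical choices.

```python
# Minimal sketch (assumed, not the paper's code): a two-modality VAE whose
# shared latent has a learnable mixture-of-Gaussians prior, so cluster
# assignments emerge from information shared across modalities.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultimodalClusterVAE(nn.Module):
    def __init__(self, dim_x1=784, dim_x2=10, dim_z=16, n_clusters=5):
        super().__init__()
        # One encoder/decoder pair per modality; sizes are illustrative.
        self.enc1 = nn.Sequential(nn.Linear(dim_x1, 128), nn.ReLU(), nn.Linear(128, 2 * dim_z))
        self.enc2 = nn.Sequential(nn.Linear(dim_x2, 128), nn.ReLU(), nn.Linear(128, 2 * dim_z))
        self.dec1 = nn.Sequential(nn.Linear(dim_z, 128), nn.ReLU(), nn.Linear(128, dim_x1))
        self.dec2 = nn.Sequential(nn.Linear(dim_z, 128), nn.ReLU(), nn.Linear(128, dim_x2))
        # Learnable mixture prior over the shared latent: one Gaussian per cluster.
        self.mu_c = nn.Parameter(torch.randn(n_clusters, dim_z))
        self.logvar_c = nn.Parameter(torch.zeros(n_clusters, dim_z))

    def encode(self, x1, x2):
        # Fuse the modality-specific posteriors with a product of experts
        # (one common fusion choice for multimodal VAEs).
        mu1, logvar1 = self.enc1(x1).chunk(2, dim=-1)
        mu2, logvar2 = self.enc2(x2).chunk(2, dim=-1)
        prec = torch.exp(-logvar1) + torch.exp(-logvar2) + 1.0  # + standard-normal expert
        mu = (mu1 * torch.exp(-logvar1) + mu2 * torch.exp(-logvar2)) / prec
        return mu, -torch.log(prec)  # joint posterior mean and log-variance

    def cluster_responsibilities(self, z):
        # Posterior over clusters from per-component Gaussian log-densities,
        # assuming uniform mixture weights for simplicity.
        log_p = -0.5 * ((((z.unsqueeze(1) - self.mu_c) ** 2) / self.logvar_c.exp())
                        + self.logvar_c).sum(-1)
        return F.softmax(log_p, dim=-1)

    def forward(self, x1, x2):
        mu, logvar = self.encode(x1, x2)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return self.dec1(z), self.dec2(z), mu, logvar, self.cluster_responsibilities(z)
```

Under this assumed setup, training would maximize an ELBO whose KL term pulls the shared posterior toward the learned mixture prior, and a data point's cluster is read off as the argmax of `cluster_responsibilities`; unconditional generation samples a component, then a latent, then decodes each modality.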
Cite
Text
Palumbo et al. "Deep Generative Clustering with Multimodal Variational Autoencoders." ICML 2023 Workshops: SPIGM, 2023.

Markdown
[Palumbo et al. "Deep Generative Clustering with Multimodal Variational Autoencoders." ICML 2023 Workshops: SPIGM, 2023.](https://mlanthology.org/icmlw/2023/palumbo2023icmlw-deep-a/)

BibTeX
@inproceedings{palumbo2023icmlw-deep-a,
title = {{Deep Generative Clustering with Multimodal Variational Autoencoders}},
author = {Palumbo, Emanuele and Laguna, Sonia and Chopard, Daphné and Vogt, Julia E.},
booktitle = {ICML 2023 Workshops: SPIGM},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/palumbo2023icmlw-deep-a/}
}