Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein

Abstract

Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets. Traditionally, this involves either projecting data onto lower-dimensional spaces with dimensionality reduction (DR) methods or organizing points into meaningful clusters (clustering). In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem. This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows them to be addressed jointly within a single optimization problem. We empirically demonstrate its relevance to the identification of low-dimensional prototypes representing data at different scales, across multiple image and genomic datasets.
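
To make the abstract's idea concrete, here is a minimal, hedged sketch (not the authors' released implementation) of the ingredient the framework builds on: coupling a dataset's pairwise-similarity structure with that of a small set of low-dimensional prototypes via the generic Gromov-Wasserstein solver in the POT library. The prototype coordinates `Z` are fixed at random purely for illustration; the paper's distributional reduction objective also optimizes them.

```python
import numpy as np
import ot  # Python Optimal Transport (POT)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))   # high-dimensional data (n=200, d=50)
Z = rng.normal(size=(10, 2))     # low-dimensional prototypes (m=10, d'=2), fixed here for brevity

C1 = ot.dist(X, X)               # pairwise (squared Euclidean) costs in the input space
C2 = ot.dist(Z, Z)               # pairwise costs in the embedding space
p, q = ot.unif(len(X)), ot.unif(len(Z))

# Gromov-Wasserstein coupling between the two similarity structures
T = ot.gromov.gromov_wasserstein(C1, C2, p, q, loss_fun='square_loss')

# Reading the coupling as a soft assignment of points to prototypes yields a
# clustering; the prototype coordinates provide the low-dimensional view.
labels = T.argmax(axis=1)
print(labels[:10])
```

In the paper's setting, alternating between updating such a coupling and the prototype positions is what ties DR and clustering together in one optimization problem; the snippet above only shows the coupling step under these illustrative assumptions.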

Cite

Text

Van Assel et al. "Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein." Transactions on Machine Learning Research, 2025.

Markdown

[Van Assel et al. "Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/assel2025tmlr-distributional/)

BibTeX

@article{assel2025tmlr-distributional,
  title     = {{Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein}},
  author    = {Van Assel, Hugues and Vincent-Cuaz, Cédric and Courty, Nicolas and Flamary, Rémi and Frossard, Pascal and Vayer, Titouan},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/assel2025tmlr-distributional/}
}