Latent Simplex Position Model: High Dimensional Multi-View Clustering with Uncertainty Quantification

Abstract

High dimensional data often contain multiple facets, and several clustering patterns can co-exist under different variable subspaces, also known as views. Although multi-view clustering algorithms have been proposed, uncertainty quantification remains difficult: a particular challenge lies in the high complexity of estimating the cluster assignment probability under each view and of sharing information among views. In this article, we propose an approximate Bayes approach that treats the similarity matrices generated over the views as rough first-stage estimates of the co-assignment probabilities; in their Kullback-Leibler neighborhood, we obtain a refined low-rank matrix, formed by the pairwise product of simplex coordinates. Interestingly, each simplex coordinate directly encodes the cluster assignment uncertainty. For multi-view clustering, we let each view draw a parameterization from a few candidates, leading to dimension reduction. With high model flexibility, the estimation can be carried out efficiently as a continuous optimization problem and therefore benefits from gradient-based computation. The theory establishes the connection of this model to a random partition distribution under multiple views. Compared to single-view clustering approaches, substantially more interpretable results are obtained when clustering brains from a human traumatic brain injury study using high-dimensional gene expression data.
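To make the central idea concrete, the following is a minimal single-view sketch, not the paper's algorithm. It assumes a symmetric similarity matrix `S` with entries in (0, 1), places each data point on the probability simplex through a row-wise softmax, and fits the low-rank matrix `W @ W.T` to `S` by plain gradient descent on a pairwise Bernoulli cross-entropy, which equals the Kullback-Leibler divergence up to a constant. The function names, the Bernoulli form of the divergence, the step size, and the toy data are all illustrative assumptions; the abstract does not specify these details, and the multi-view sharing of parameterizations is not shown.

```python
import numpy as np


def softmax_rows(Z):
    """Map each row of Z to a point on the probability simplex."""
    Z = Z - Z.max(axis=1, keepdims=True)  # shift for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)


def fit_simplex_positions(S, K, n_iter=2000, lr=0.5, seed=0):
    """Hypothetical helper: fit simplex coordinates W (n x K) so that
    W @ W.T approximates the co-assignment estimates in S off the diagonal."""
    n = S.shape[0]
    rng = np.random.default_rng(seed)
    Z = 0.1 * rng.standard_normal((n, K))  # unconstrained parameters
    off_diag = ~np.eye(n, dtype=bool)
    eps = 1e-6
    for _ in range(n_iter):
        W = softmax_rows(Z)
        P = np.clip(W @ W.T, eps, 1 - eps)
        # Derivative of the Bernoulli cross-entropy with respect to P_ij,
        # with the diagonal excluded from the loss.
        G = np.where(off_diag, -S / P + (1 - S) / (1 - P), 0.0)
        # S is assumed symmetric, so G is symmetric and dL/dW = 2 G W;
        # dividing by n only rescales the step size.
        grad_W = 2.0 * G @ W / n
        # Chain rule through the row-wise softmax.
        grad_Z = W * (grad_W - np.sum(grad_W * W, axis=1, keepdims=True))
        Z -= lr * grad_Z
    return softmax_rows(Z)


# Toy similarity matrix (assumed data): two blocks of 10 points each,
# similarity 0.9 within a block and 0.1 between blocks.
n, K = 20, 2
truth = np.repeat([0, 1], 10)
S = np.where(truth[:, None] == truth[None, :], 0.9, 0.1)

W = fit_simplex_positions(S, K)
labels = W.argmax(axis=1)  # hard assignment, if one is needed
P_fit = W @ W.T            # refined low-rank co-assignment matrix
```

Each row of `W` sums to one and can be read as soft cluster membership for that point, which is the sense in which a simplex coordinate carries assignment uncertainty in this sketch. Cluster labels are identified only up to permutation.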

Cite

Text

Duan. "Latent Simplex Position Model: High Dimensional Multi-View Clustering with Uncertainty Quantification." Journal of Machine Learning Research, 2020.

Markdown

[Duan. "Latent Simplex Position Model: High Dimensional Multi-View Clustering with Uncertainty Quantification." Journal of Machine Learning Research, 2020.](https://mlanthology.org/jmlr/2020/duan2020jmlr-latent/)

BibTeX

@article{duan2020jmlr-latent,
  title     = {{Latent Simplex Position Model: High Dimensional Multi-View Clustering with Uncertainty Quantification}},
  author    = {Duan, Leo L.},
  journal   = {Journal of Machine Learning Research},
  year      = {2020},
  pages     = {1--25},
  volume    = {21},
  url       = {https://mlanthology.org/jmlr/2020/duan2020jmlr-latent/}
}