Computation-Information Gap in High-Dimensional Clustering

Abstract

We investigate the existence of a fundamental computation-information gap for the problem of clustering a mixture of isotropic Gaussians in the high-dimensional regime, where the ambient dimension $p$ is larger than the number $n$ of points. The existence of a computation-information gap in a specific Bayesian high-dimensional asymptotic regime has been conjectured by Lesieur et al. (2016) based on the replica heuristic from statistical physics. We provide evidence of the existence of such a gap generically in the high-dimensional regime $p\geq n$, by (i) proving a non-asymptotic low-degree polynomial computational barrier for clustering in high dimension, matching the performance of the best known polynomial-time algorithms, and by (ii) establishing that the information barrier for clustering is smaller than the computational barrier when the number $K$ of clusters is large enough. These results contrast with the (moderately) low-dimensional regime $n\geq \text{poly}(p,K)$, where there is no computation-information gap for clustering a mixture of isotropic Gaussians. To prove our low-degree computational barrier, we develop sophisticated combinatorial arguments to upper-bound the mixed moments of the signal under a Bernoulli Bayesian model.

Cite

Text

Even et al. "Computation-Information Gap in High-Dimensional Clustering." Conference on Learning Theory, 2024.

Markdown

[Even et al. "Computation-Information Gap in High-Dimensional Clustering." Conference on Learning Theory, 2024.](https://mlanthology.org/colt/2024/even2024colt-computationinformation/)

BibTeX

@inproceedings{even2024colt-computationinformation,
  title     = {{Computation-Information Gap in High-Dimensional Clustering}},
  author    = {Even, Bertrand and Giraud, Christophe and Verzelen, Nicolas},
  booktitle = {Conference on Learning Theory},
  year      = {2024},
  pages     = {1646--1712},
  volume    = {247},
  url       = {https://mlanthology.org/colt/2024/even2024colt-computationinformation/}
}