An Investigation of Computational and Informational Limits in Gaussian Mixture Clustering

Abstract

We investigate under what conditions clustering by learning a mixture of spherical Gaussians is (a) computationally tractable; and (b) statistically possible. We show that using principal component projection greatly aids in recovering the clustering using EM; present empirical evidence that even using such a projection, there is still a large gap between the number of samples needed to recover the clustering using EM, and the number of samples needed without computational restrictions; and characterize the regime in which such a gap exists.

Cite

Text

Srebro et al. "An Investigation of Computational and Informational Limits in Gaussian Mixture Clustering." International Conference on Machine Learning, 2006. doi:10.1145/1143844.1143953

Markdown

[Srebro et al. "An Investigation of Computational and Informational Limits in Gaussian Mixture Clustering." International Conference on Machine Learning, 2006.](https://mlanthology.org/icml/2006/srebro2006icml-investigation/) doi:10.1145/1143844.1143953

BibTeX

@inproceedings{srebro2006icml-investigation,
  title     = {{An Investigation of Computational and Informational Limits in Gaussian Mixture Clustering}},
  author    = {Srebro, Nathan and Shakhnarovich, Gregory and Roweis, Sam T.},
  booktitle = {International Conference on Machine Learning},
  year      = {2006},
  pages     = {865-872},
  doi       = {10.1145/1143844.1143953},
  url       = {https://mlanthology.org/icml/2006/srebro2006icml-investigation/}
}