Nearly Tight Sample Complexity Bounds for Learning Mixtures of Gaussians via Sample Compression Schemes

Hassan Ashtiani, Shai Ben-David, Nicholas Harvey, Christopher Liaw, Abbas Mehrabian, Yaniv Plan

NeurIPS 2018 pp. 3412-3421

Abstract

We prove that Θ̃(k d^2 / ε^2) samples are necessary and sufficient for learning a mixture of k Gaussians in R^d, up to error ε in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(k d / ε^2) samples suffice, matching a known lower bound. The upper bound relies on a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that allows such a sample compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in R^d has an efficient sample compression.
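To give a feel for how the two bounds in the abstract scale, the following is a minimal sketch that evaluates only the leading terms k d^2 / ε^2 and k d / ε^2. It assumes all constants and any logarithmic factors are dropped, so the outputs are scaling quantities, not actual sample counts. The function names are hypothetical and do not come from the paper.

```python
def general_mixture_scaling(k: int, d: int, eps: float) -> float:
    """Leading term k * d^2 / eps^2 for mixtures of k general Gaussians in R^d.

    Assumption: constants and logarithmic factors are ignored.
    """
    return k * d**2 / eps**2


def axis_aligned_mixture_scaling(k: int, d: int, eps: float) -> float:
    """Leading term k * d / eps^2 for mixtures of k axis-aligned Gaussians.

    Assumption: constants and logarithmic factors are ignored.
    """
    return k * d / eps**2


if __name__ == "__main__":
    k, d, eps = 2, 3, 0.5
    general = general_mixture_scaling(k, d, eps)
    axis_aligned = axis_aligned_mixture_scaling(k, d, eps)
    # The two leading terms differ by exactly a factor of d.
    print(general, axis_aligned, general / axis_aligned)
```

Under these assumptions, the general case costs a factor of d more than the axis-aligned case, which reflects the d^2 parameters of a full covariance matrix versus the d parameters of a diagonal one.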

Cite

Text

Ashtiani et al. "Nearly Tight Sample Complexity Bounds for Learning Mixtures of Gaussians via Sample Compression Schemes." Neural Information Processing Systems, 2018.

Markdown

[Ashtiani et al. "Nearly Tight Sample Complexity Bounds for Learning Mixtures of Gaussians via Sample Compression Schemes." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/ashtiani2018neurips-nearly/)

BibTeX

@inproceedings{ashtiani2018neurips-nearly,
  title     = {{Nearly Tight Sample Complexity Bounds for Learning Mixtures of Gaussians via Sample Compression Schemes}},
  author    = {Ashtiani, Hassan and Ben-David, Shai and Harvey, Nicholas and Liaw, Christopher and Mehrabian, Abbas and Plan, Yaniv},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {3412-3421},
  url       = {https://mlanthology.org/neurips/2018/ashtiani2018neurips-nearly/}
}