Near-Optimal Sample Complexity Bounds for Learning Latent $k-$polytopes and Applications to Ad-Mixtures

Abstract

Deriving Optimal bounds on Sample Complexity of Latent Variable models is an active area of research. Recently such bounds were obtained for Mixture of Gaussians \cite{HSNCAY18}, no such results are known for Ad-mixtures, a generalization of Mixture distributions. In this paper we show that $O^*(dk/m)$ samples are sufficient to learn each of $k-$ topic vectors of LDA, a popular Ad-mixture model, with vocabulary size $d$ and $m\in \Omega(1)$ words per document, to any constant error in $L_1$ norm. The result is a corollary of the major contribution of this paper: the first sample complexity upper bound for the problem (introduced in \cite{BK20}) of learning the vertices of a Latent $k-$ Polytope in $\RR^d$, given perturbed points from it. The bound, $O^*(dk/\beta)$, is optimal and linear in number of parameters. It applies to many stochastic models including a broad class Ad-mixtures. To demonstrate the generality of the approach we specialize the setting to Mixed Membership Stochastic Block Models(MMSB) and show for the first time that if an MMSB has $k$ blocks, the sample complexity is $O^*(k^2)$ under usual assumptions.

Cite

Text

Bhattacharyya and Kannan. "Near-Optimal Sample Complexity Bounds for Learning Latent $k-$polytopes and Applications to Ad-Mixtures." International Conference on Machine Learning, 2020.

Markdown

[Bhattacharyya and Kannan. "Near-Optimal Sample Complexity Bounds for Learning Latent $k-$polytopes and Applications to Ad-Mixtures." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/bhattacharyya2020icml-nearoptimal/)

BibTeX

@inproceedings{bhattacharyya2020icml-nearoptimal,
  title     = {{Near-Optimal Sample Complexity Bounds for Learning Latent $k-$polytopes and Applications to Ad-Mixtures}},
  author    = {Bhattacharyya, Chiranjib and Kannan, Ravindran},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {854-863},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/bhattacharyya2020icml-nearoptimal/}
}