Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models

Abstract

We study the gradient Expectation-Maximization (EM) algorithm for Gaussian Mixture Models (GMM) in the over-parameterized setting, where a general GMM with $n>1$ components learns from data that are generated by a single ground truth Gaussian distribution. While results for the special case of 2-Gaussian mixtures are well-known, a general global convergence analysis for arbitrary $n$ remains unresolved and faces several new technical barriers since the convergence becomes sublinear and non-monotonic. To address these challenges, we construct a novel likelihood-based convergence analysis framework and rigorously prove that gradient EM converges globally with a sublinear rate $O(1/\sqrt{t})$. This is the first global convergence result for Gaussian mixtures with more than $2$ components. The sublinear convergence rate is due to the algorithmic nature of learning over-parameterized GMMs with gradient EM. We also identify a new emerging technical challenge for learning general over-parameterized GMMs: the existence of bad local regions that can trap gradient EM for an exponential number of steps.
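To make the setting concrete, here is a minimal sketch (not the authors' exact procedure) of gradient EM in the over-parameterized regime described above: $n$ unit-covariance components with uniform mixing weights are fit to data drawn from a single ground-truth Gaussian, and only the component means are updated by one gradient-ascent step on the sample log-likelihood per iteration. The step size `eta`, the initialization scale, and the toy data are illustrative assumptions.

```python
import numpy as np

def gradient_em(X, n_components, eta=0.5, n_steps=2000, seed=0):
    """Sketch of gradient EM for a GMM with unit covariances and uniform weights."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    mu = rng.normal(size=(n_components, d))  # random initialization of the means
    for _ in range(n_steps):
        # E-step quantities: responsibilities under unit-covariance components
        # with uniform mixing weights (the 1/n factor cancels in the softmax).
        sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)   # (m, n)
        log_r = -0.5 * sq_dist
        log_r -= log_r.max(axis=1, keepdims=True)                   # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # Gradient of the sample log-likelihood with respect to each mean.
        grad = (r[:, :, None] * (X[:, None, :] - mu[None, :, :])).mean(0)  # (n, d)
        # Gradient EM: a single gradient step, rather than a full M-step.
        mu += eta * grad
    return mu

# Toy run: data from a single standard Gaussian, fit with n = 3 components.
X = np.random.default_rng(1).normal(size=(5000, 2))
means = gradient_em(X, n_components=3)
print(np.round(means, 2))  # all fitted means drift toward the true mean (the origin)
```

The single gradient step per iteration (instead of the exact mean update of standard EM) is what distinguishes gradient EM, and the paper's $O(1/\sqrt{t})$ rate concerns how quickly the resulting likelihood approaches that of the ground-truth Gaussian.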

Cite

Text

Xu et al. "Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models." ICML 2024 Workshops: HiLD, 2024.

Markdown

[Xu et al. "Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models." ICML 2024 Workshops: HiLD, 2024.](https://mlanthology.org/icmlw/2024/xu2024icmlw-global/)

BibTeX

@inproceedings{xu2024icmlw-global,
  title     = {{Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models}},
  author    = {Xu, Weihang and Fazel, Maryam and Du, Simon Shaolei},
  booktitle = {ICML 2024 Workshops: HiLD},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/xu2024icmlw-global/}
}