Convergence Rates for Gaussian Mixtures of Experts

Abstract

We provide a theoretical treatment of over-specified Gaussian mixtures of experts with covariate-free gating networks. We establish convergence rates for maximum likelihood estimation (MLE) in these models. Our proof technique is based on a novel notion of algebraic independence of the expert functions. Drawing on optimal transport, we establish a connection between the algebraic independence of the expert functions and a certain class of partial differential equations (PDEs) with respect to the parameters. Exploiting this connection allows us to derive convergence rates for parameter estimation.
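
To make the abstract's setting concrete, the following is a minimal sketch of the model class and estimator under discussion. The notation (mixing measure G, expert function h, fitted order k, true order k_0) is a standard convention for mixture-of-experts analyses, used here for illustration rather than quoted from the paper.

% Minimal sketch: Gaussian mixture of experts with covariate-free gating.
% The gating weights \pi_i are constants that do not depend on x.
\[
  p(y \mid x; G) \;=\; \sum_{i=1}^{k} \pi_i \,
  \phi\bigl(y \mid h(x, \eta_i), \sigma_i^2\bigr),
  \qquad
  G \;=\; \sum_{i=1}^{k} \pi_i \, \delta_{(\eta_i, \sigma_i)},
\]
% where \phi(\cdot \mid \mu, \sigma^2) is the Gaussian density and
% h(x, \eta) is an expert function, e.g. the linear expert
% h(x, (a, b)) = a^\top x + b. The model is over-specified when the
% fitted order k exceeds the true number of components k_0. Given an
% i.i.d. sample, the MLE over mixing measures of order at most k is
\[
  \widehat{G}_n \;=\;
  \operatorname*{arg\,max}_{G :\, |\mathrm{supp}(G)| \le k}
  \;\sum_{j=1}^{n} \log p\bigl(Y_j \mid X_j; G\bigr),
\]
% and rates for parameter estimation are naturally stated via an
% optimal-transport (Wasserstein) distance between \widehat{G}_n and
% the true mixing measure.

Since such rates are typically measured in Wasserstein distance between mixing measures, the toy Python snippet below computes that distance for an over-specified estimate versus a truth, using the POT (Python Optimal Transport) library. The atoms and weights are made-up numbers chosen purely for illustration.

# Toy illustration (not from the paper): W1 distance between an
# over-specified estimated mixing measure and the true one.
# Requires the POT package: pip install pot
import numpy as np
import ot  # Python Optimal Transport

# True mixing measure G0: two atoms (expert parameters) with weights.
atoms_true = np.array([[1.0, 0.5], [-2.0, 1.0]])     # e.g. (slope, scale) pairs
weights_true = np.array([0.6, 0.4])

# Over-specified estimate: three atoms, two of them near the first true atom.
atoms_est = np.array([[1.1, 0.45], [0.9, 0.55], [-2.05, 1.02]])
weights_est = np.array([0.35, 0.25, 0.40])

# Cost matrix of pairwise Euclidean distances between atoms.
M = ot.dist(atoms_est, atoms_true, metric='euclidean')

# W1(G_hat, G0): exact optimal transport cost via the EMD solver.
w1 = ot.emd2(weights_est, weights_true, M)
print(f"W1 distance between mixing measures: {w1:.4f}")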

Cite

Text

Ho et al. "Convergence Rates for Gaussian Mixtures of Experts." Journal of Machine Learning Research, 2022.

Markdown

[Ho et al. "Convergence Rates for Gaussian Mixtures of Experts." Journal of Machine Learning Research, 2022.](https://mlanthology.org/jmlr/2022/ho2022jmlr-convergence/)

BibTeX

@article{ho2022jmlr-convergence,
  title     = {{Convergence Rates for Gaussian Mixtures of Experts}},
  author    = {Ho, Nhat and Yang, Chiao-Yu and Jordan, Michael I.},
  journal   = {Journal of Machine Learning Research},
  year      = {2022},
  pages     = {1--81},
  volume    = {23},
  url       = {https://mlanthology.org/jmlr/2022/ho2022jmlr-convergence/}
}