Understanding Stochastic Natural Gradient Variational Inference

Abstract

Stochastic natural gradient variational inference (NGVI) is a popular posterior inference method with applications in various probabilistic models. Despite its wide usage, little is known about its non-asymptotic convergence rate in the stochastic setting. We aim to narrow this gap and provide a better understanding. For conjugate likelihoods, we prove the first $\mathcal{O}(\frac{1}{T})$ non-asymptotic convergence rate of stochastic NGVI. The complexity is no worse than that of stochastic gradient descent (a.k.a. black-box variational inference), and the rate likely has a better constant dependency, which leads to faster convergence in practice. For non-conjugate likelihoods, we show that stochastic NGVI with the canonical parameterization implicitly optimizes a non-convex objective. Thus, a global convergence rate of $\mathcal{O}(\frac{1}{T})$ is unlikely without some significant new understanding of optimizing the ELBO using natural gradients.
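
For readers unfamiliar with the conjugate setting, the following is a minimal sketch of a stochastic NGVI update in a toy model, not the authors' code or experimental setup. It assumes a Gaussian likelihood with known noise variance and a Gaussian prior on the mean, so that a natural gradient step on the natural parameters reduces to a convex combination of the current parameters and a minibatch estimate of the exact posterior's natural parameters. The function names, the 1/(t+1) step size, and all default values are illustrative assumptions.

```python
import numpy as np


def stochastic_ngvi_gaussian_mean(x, noise_var=1.0, prior_mean=0.0, prior_var=10.0,
                                  batch_size=10, num_steps=5000, seed=0):
    """Toy stochastic NGVI for the mean of a Gaussian with known noise variance.

    The variational family is q(mu) = N(m, v), stored through its natural
    parameters lam = (m / v, -1 / (2 v)). All defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    prior_nat = np.array([prior_mean / prior_var, -0.5 / prior_var])
    lam = prior_nat.copy()  # start q at the prior
    for t in range(num_steps):
        # Sample a minibatch with replacement.
        batch = x[rng.integers(0, n, size=batch_size)]
        # Unbiased minibatch estimate of the likelihood's contribution
        # to the posterior natural parameters, rescaled to the full data set.
        lik_nat = (n / batch_size) * np.array([batch.sum() / noise_var,
                                               -0.5 * batch_size / noise_var])
        target = prior_nat + lik_nat
        # In this conjugate model the stochastic natural gradient is (target - lam),
        # so a step of size `step` is a convex combination of lam and target.
        step = 1.0 / (t + 1)  # assumed decaying step size
        lam = (1.0 - step) * lam + step * target
    return lam


def exact_posterior_nat(x, noise_var=1.0, prior_mean=0.0, prior_var=10.0):
    """Natural parameters of the exact Gaussian posterior over the mean."""
    n = x.shape[0]
    return np.array([prior_mean / prior_var + x.sum() / noise_var,
                     -0.5 / prior_var - 0.5 * n / noise_var])


def nat_to_mean_var(lam):
    """Convert natural parameters (m / v, -1 / (2 v)) back to (m, v)."""
    v = -0.5 / lam[1]
    return lam[0] * v, v
```

Calling `stochastic_ngvi_gaussian_mean(x)` on synthetic data and comparing the result with `exact_posterior_nat(x)` shows the iterate approaching the exact posterior as the number of steps grows. With the 1/(t+1) step size, the iterate is simply the running average of the minibatch targets, which is one way to see why a rate of order 1/T is plausible in this toy case.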

Cite

Text

Wu and Gardner. "Understanding Stochastic Natural Gradient Variational Inference." International Conference on Machine Learning, 2024.

Markdown

[Wu and Gardner. "Understanding Stochastic Natural Gradient Variational Inference." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/wu2024icml-understanding/)

BibTeX

@inproceedings{wu2024icml-understanding,
  title     = {{Understanding Stochastic Natural Gradient Variational Inference}},
  author    = {Wu, Kaiwen and Gardner, Jacob R.},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {53398--53421},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/wu2024icml-understanding/}
}