Understanding Stochastic Natural Gradient Variational Inference
Abstract
Stochastic natural gradient variational inference (NGVI) is a popular posterior inference method with applications in various probabilistic models. Despite its wide usage, little is known about its non-asymptotic convergence rate in the stochastic setting. We aim to narrow this gap and provide a better understanding. For conjugate likelihoods, we prove the first $\mathcal{O}(\frac{1}{T})$ non-asymptotic convergence rate of stochastic NGVI. The complexity is no worse than that of stochastic gradient descent (a.k.a. black-box variational inference), and the rate likely has a better constant dependence, leading to faster convergence in practice. For non-conjugate likelihoods, we show that stochastic NGVI with the canonical parameterization implicitly optimizes a non-convex objective. Thus, a global convergence rate of $\mathcal{O}(\frac{1}{T})$ is unlikely without some significant new understanding of optimizing the ELBO using natural gradients.
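To make the method concrete, below is a minimal sketch (not taken from the paper) of stochastic NGVI on a simple conjugate model: a Gaussian prior on a mean parameter with a Gaussian likelihood, and a Gaussian variational posterior in its natural parameterization. The model, variable names, and hyperparameters are illustrative assumptions; the update itself is the standard stochastic natural-gradient step, which for conjugate exponential-family models reduces to a convex combination in natural-parameter space.

```python
import numpy as np

# Minimal sketch (assumptions, not the paper's code): stochastic NGVI for
#   z ~ N(mu0, sigma0^2),  x_i | z ~ N(z, sigma^2),  i = 1..N,
# with Gaussian variational posterior q(z) = N(m, s^2) in natural
# parameterization lam = (m / s^2, -1 / (2 s^2)).
rng = np.random.default_rng(0)

# Synthetic data (values are illustrative).
N, sigma, mu0, sigma0 = 1000, 1.0, 0.0, 10.0
z_true = 2.0
x = rng.normal(z_true, sigma, size=N)

# Prior natural parameters; initialize the variational parameters at the prior.
eta_prior = np.array([mu0 / sigma0**2, -0.5 / sigma0**2])
lam = eta_prior.copy()

batch_size, rho = 50, 0.1  # minibatch size and (constant) step size
for t in range(500):
    batch = rng.choice(N, size=batch_size, replace=False)
    # Unbiased estimate of the target natural parameters:
    # prior plus rescaled minibatch likelihood contribution.
    lam_hat = eta_prior + (N / batch_size) * np.array(
        [x[batch].sum() / sigma**2, -0.5 * batch_size / sigma**2]
    )
    # Stochastic NGVI step: for conjugate models the natural gradient of the
    # ELBO is (lam_hat - lam), so the update is a convex combination.
    lam = (1.0 - rho) * lam + rho * lam_hat

# Recover mean/variance of q and compare with the exact posterior.
s2 = -0.5 / lam[1]
m = lam[0] * s2
post_var = 1.0 / (1.0 / sigma0**2 + N / sigma**2)
post_mean = post_var * (mu0 / sigma0**2 + x.sum() / sigma**2)
print(f"q: mean={m:.3f}, var={s2:.5f}")
print(f"exact posterior: mean={post_mean:.3f}, var={post_var:.5f}")
```

In the conjugate setting the iterates converge to the exact posterior; contrast this with black-box variational inference, which would instead take SGD steps on the ELBO with respect to (m, s) using reparameterized gradient estimates.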
Cite
Text
Wu and Gardner. "Understanding Stochastic Natural Gradient Variational Inference." International Conference on Machine Learning, 2024.
Markdown
[Wu and Gardner. "Understanding Stochastic Natural Gradient Variational Inference." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/wu2024icml-understanding/)
BibTeX
@inproceedings{wu2024icml-understanding,
title = {{Understanding Stochastic Natural Gradient Variational Inference}},
author = {Wu, Kaiwen and Gardner, Jacob R.},
booktitle = {International Conference on Machine Learning},
year = {2024},
pages = {53398--53421},
volume = {235},
url = {https://mlanthology.org/icml/2024/wu2024icml-understanding/}
}