Markovian Score Climbing: Variational Inference with KL(p||q)

Abstract

Modern variational inference (VI) uses stochastic gradients to avoid intractable expectations, enabling large-scale probabilistic inference in complex models. VI posits a family of approximating distributions q and then finds the member of that family that is closest to the exact posterior p. Traditionally, VI algorithms minimize the “exclusive” Kullback-Leibler (KL) divergence, KL(q||p), often for computational convenience. Recent research, however, has also focused on the “inclusive” KL, KL(p||q), which has good statistical properties that make it more appropriate for certain inference problems. This paper develops a simple algorithm for reliably minimizing the inclusive KL using stochastic gradients with vanishing bias. This method, which we call Markovian score climbing (MSC), converges to a local optimum of the inclusive KL. It does not suffer from the systematic errors inherent in existing methods, such as Reweighted Wake-Sleep and Neural Adaptive Sequential Monte Carlo, which lead to bias in their final estimates. We illustrate convergence on a toy model and demonstrate the utility of MSC on Bayesian probit regression for classification as well as a stochastic volatility model for financial data.
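To make the idea concrete, here is a minimal sketch of score climbing with a Markov kernel on a one-dimensional toy problem. It rests on the identity that the gradient of KL(p||q) with respect to the parameters of q is the negative expectation under p of the score, the gradient of log q. Everything specific here is an assumption for illustration and is not taken from the abstract: the Gaussian target, the Gaussian family for q parameterized by `mu` and `log_sigma`, the conditional importance sampling kernel in `cis_step`, the step-size schedule, and all function names. It should be read as one plausible instance of the approach, not as the authors' implementation.

```python
import numpy as np

# Hypothetical toy target: an unnormalized Gaussian with mean 1.0 and std 0.7.
TARGET_MEAN, TARGET_STD = 1.0, 0.7


def log_p_unnorm(z):
    """Log of the unnormalized target density (assumed for illustration)."""
    return -0.5 * ((z - TARGET_MEAN) / TARGET_STD) ** 2


def log_q(z, mu, log_sigma):
    """Log density of the Gaussian approximation q(z; mu, exp(log_sigma))."""
    sigma = np.exp(log_sigma)
    return -0.5 * ((z - mu) / sigma) ** 2 - log_sigma - 0.5 * np.log(2 * np.pi)


def score_q(z, mu, log_sigma):
    """Gradient of log q with respect to (mu, log_sigma)."""
    sigma = np.exp(log_sigma)
    d_mu = (z - mu) / sigma**2
    d_log_sigma = ((z - mu) / sigma) ** 2 - 1.0
    return d_mu, d_log_sigma


def cis_step(z_prev, mu, log_sigma, n_particles, rng):
    """One conditional importance sampling move that leaves p invariant.

    The previous state is kept as one particle, the rest are drawn from q,
    and the next state is resampled in proportion to the weights p/q.
    """
    sigma = np.exp(log_sigma)
    z = np.empty(n_particles)
    z[0] = z_prev
    z[1:] = mu + sigma * rng.standard_normal(n_particles - 1)
    log_w = log_p_unnorm(z) - log_q(z, mu, log_sigma)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    return z[rng.choice(n_particles, p=w)]


def msc(n_iters=10_000, n_particles=10, seed=0):
    """Stochastic approximation driven by a Markov chain targeting p."""
    rng = np.random.default_rng(seed)
    mu, log_sigma, z = 0.0, 0.0, 0.0
    for k in range(1, n_iters + 1):
        z = cis_step(z, mu, log_sigma, n_particles, rng)
        d_mu, d_log_sigma = score_q(z, mu, log_sigma)
        step = 0.1 / k**0.6  # decaying Robbins-Monro step size (assumed)
        mu += step * d_mu
        log_sigma += step * d_log_sigma
    return mu, np.exp(log_sigma)


if __name__ == "__main__":
    mu_hat, sigma_hat = msc()
    print(f"fitted mean {mu_hat:.3f}, fitted std {sigma_hat:.3f}")
```

Because q is Gaussian in this sketch, the minimizer of the inclusive KL matches the mean and standard deviation of the target, so the fitted values should land near 1.0 and 0.7. Each draw from the chain is not an exact sample from p, but the kernel keeps p invariant, which is what lets the bias of the gradient estimates shrink as the iterations proceed.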

Cite

Text

Naesseth et al. "Markovian Score Climbing: Variational Inference with KL(p||q)." Neural Information Processing Systems, 2020.

Markdown

[Naesseth et al. "Markovian Score Climbing: Variational Inference with KL(p||q)." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/naesseth2020neurips-markovian/)

BibTeX

@inproceedings{naesseth2020neurips-markovian,
  title     = {{Markovian Score Climbing: Variational Inference with KL(p||q)}},
  author    = {Naesseth, Christian and Lindsten, Fredrik and Blei, David M.},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/naesseth2020neurips-markovian/}
}