On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo

Abstract

We provide convergence guarantees in Wasserstein distance for a variety of variance-reduction methods: SAGA Langevin diffusion, SVRG Langevin diffusion and control-variate underdamped Langevin diffusion. We analyze these methods under a uniform set of assumptions on the log-posterior distribution, assuming it to be smooth, strongly convex and Hessian Lipschitz. This is achieved by a new proof technique combining ideas from finite-sum optimization and the analysis of sampling methods. Our sharp theoretical bounds allow us to identify regimes of interest where each method performs better than the others. Our theory is verified with experiments on real-world and synthetic datasets.

Cite

Text

Chatterji et al. "On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo." International Conference on Machine Learning, 2018.

Markdown

[Chatterji et al. "On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/chatterji2018icml-theory/)

BibTeX

@inproceedings{chatterji2018icml-theory,
  title     = {{On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo}},
  author    = {Chatterji, Niladri and Flammarion, Nicolas and Ma, Yian and Bartlett, Peter and Jordan, Michael},
  booktitle = {International Conference on Machine Learning},
  year      = {2018},
  pages     = {764--773},
  volume    = {80},
  url       = {https://mlanthology.org/icml/2018/chatterji2018icml-theory/}
}