Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
Abstract
The posteriors over neural network weights are high-dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes and smaller steps characterize each mode. We establish non-asymptotic convergence guarantees for the proposed algorithm. Moreover, we provide extensive experimental results, including experiments on ImageNet, to demonstrate the effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
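As a concrete illustration of the schedule described in the abstract, below is a minimal Python sketch. The cosine-with-restarts formula follows the paper; the function names (`cyclical_stepsize`, `sgld_step`), argument names, and the torch-based SGLD update are illustrative assumptions, not the authors' released code.

```python
import math

import torch


def cyclical_stepsize(k, total_iters, num_cycles, alpha0):
    """Cosine cyclical schedule: within each cycle the stepsize decays
    from alpha0 toward 0, then resets, so early (large) steps jump
    between modes and late (small) steps explore the current mode."""
    cycle_len = math.ceil(total_iters / num_cycles)
    frac = (k % cycle_len) / cycle_len  # progress within the current cycle
    return 0.5 * alpha0 * (math.cos(math.pi * frac) + 1.0)


def sgld_step(params, grads_log_post, alpha):
    """One SGLD update (illustrative): ascend the stochastic
    log-posterior gradient and inject Gaussian noise with variance
    2 * alpha, the standard Langevin discretization."""
    with torch.no_grad():
        for p, g in zip(params, grads_log_post):
            p.add_(alpha * g + math.sqrt(2.0 * alpha) * torch.randn_like(p))
```

In the paper, each cycle is split into an exploration stage (large stepsizes, no samples collected) and a sampling stage (small stepsizes, samples retained), so only the low-stepsize portion of each cycle contributes to the posterior approximation.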
Cite
Text
Zhang et al. "Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning." International Conference on Learning Representations, 2020.

Markdown

[Zhang et al. "Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/zhang2020iclr-cyclical/)

BibTeX
@inproceedings{zhang2020iclr-cyclical,
  title = {{Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning}},
  author = {Zhang, Ruqi and Li, Chunyuan and Zhang, Jianyi and Chen, Changyou and Wilson, Andrew Gordon},
  booktitle = {International Conference on Learning Representations},
  year = {2020},
  url = {https://mlanthology.org/iclr/2020/zhang2020iclr-cyclical/}
}