Efficient Low Rank Gaussian Variational Inference for Neural Networks

Abstract

Bayesian neural networks are enjoying a renaissance driven in part by recent advances in variational inference (VI). The most common form of VI employs a fully factorized or mean-field distribution, but this is known to suffer from several pathologies, especially since we expect posterior distributions with highly correlated parameters. Current algorithms that capture these correlations with a Gaussian approximating family are difficult to scale to large models due to computational costs and high variance of gradient updates. By using a new form of the reparametrization trick, we derive a computationally efficient algorithm for performing VI with a Gaussian family with a low-rank plus diagonal covariance structure. We scale to deep feed-forward and convolutional architectures. We find that adding low-rank terms to a parametrized diagonal covariance does not improve predictive performance except on small networks, but low-rank terms added to a constant diagonal covariance improve performance on both small and large-scale network architectures.
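
The core idea can be illustrated with a minimal sketch (not the authors' implementation): a weight sample from a Gaussian with covariance diag(sigma^2) + B B^T can be drawn via the reparametrization trick as a deterministic function of the variational parameters and two independent noise vectors. The function name sample_low_rank_gaussian and the parameter names mu, log_sigma, B below are illustrative assumptions.

    import numpy as np

    def sample_low_rank_gaussian(mu, log_sigma, B, rng):
        """Draw w ~ N(mu, diag(exp(log_sigma)**2) + B @ B.T) via the
        reparametrization trick: the sample is a differentiable function
        of (mu, log_sigma, B) and parameter-free Gaussian noise."""
        d, k = B.shape
        eps_diag = rng.standard_normal(d)   # noise for the diagonal part
        eps_rank = rng.standard_normal(k)   # noise for the rank-k part
        return mu + np.exp(log_sigma) * eps_diag + B @ eps_rank

    # Usage: a 10-dimensional weight vector with a rank-2 covariance term.
    rng = np.random.default_rng(0)
    d, k = 10, 2
    mu = np.zeros(d)
    log_sigma = np.full(d, -2.0)            # small diagonal scale
    B = 0.1 * rng.standard_normal((d, k))
    w = sample_low_rank_gaussian(mu, log_sigma, B, rng)

Because the rank-k noise is only k-dimensional, sampling and gradient estimation avoid forming or factorizing the full d-by-d covariance, which is what makes this family cheaper than a full-covariance Gaussian.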

Cite

Text

Tomczak et al. "Efficient Low Rank Gaussian Variational Inference for Neural Networks." Neural Information Processing Systems, 2020.

Markdown

[Tomczak et al. "Efficient Low Rank Gaussian Variational Inference for Neural Networks." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/tomczak2020neurips-efficient/)

BibTeX

@inproceedings{tomczak2020neurips-efficient,
  title     = {{Efficient Low Rank Gaussian Variational Inference for Neural Networks}},
  author    = {Tomczak, Marcin and Swaroop, Siddharth and Turner, Richard},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/tomczak2020neurips-efficient/}
}