Practical Variational Inference for Neural Networks

Abstract

Variational methods have previously been explored as a tractable approximation to Bayesian inference for neural networks. However, the approaches proposed so far have only been applicable to a few simple network architectures. This paper introduces an easy-to-implement stochastic variational method (or equivalently, minimum description length loss function) that can be applied to most neural networks. Along the way it revisits several common regularisers from a variational perspective. It also provides a simple pruning heuristic that can both drastically reduce the number of network weights and lead to improved generalisation. Experimental results are provided for a hierarchical multidimensional recurrent neural network applied to the TIMIT speech corpus.
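To make the abstract's central idea concrete, here is a minimal sketch of the kind of stochastic variational objective it describes: a diagonal Gaussian posterior over the weights, a Gaussian prior, and a loss equal to the KL divergence plus a Monte Carlo estimate of the expected negative log-likelihood obtained by sampling weights. This is an illustration under stated assumptions, not the paper's exact parameterisation; the helper names (`kl_diag_gaussians`, `expected_nll`), the softplus link for the posterior scale, and the signal-to-noise pruning threshold at the end are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (stand-in for a real training set)
X = rng.normal(size=(64, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=64)

# Diagonal Gaussian posterior q(w) = N(mu, sigma^2) over the 5 weights.
# sigma = softplus(rho) keeps the scale positive (one common choice,
# assumed here rather than taken from the paper).
mu = rng.normal(scale=0.1, size=5)
rho = np.full(5, -3.0)
sigma = np.log1p(np.exp(rho))

# Prior p(w) = N(0, prior_sigma^2), shared across all weights
prior_sigma = 1.0

def kl_diag_gaussians(mu, sigma, prior_sigma):
    """KL(q || p) between a diagonal Gaussian posterior and a
    zero-mean isotropic Gaussian prior, in closed form."""
    return np.sum(np.log(prior_sigma / sigma)
                  + (sigma**2 + mu**2) / (2 * prior_sigma**2) - 0.5)

def expected_nll(mu, sigma, X, y, n_samples=8, noise_sigma=0.1):
    """Monte Carlo estimate of E_q[-log p(y | X, w)] (up to an additive
    constant) obtained by sampling weights from the posterior."""
    total = 0.0
    for _ in range(n_samples):
        w = mu + sigma * rng.normal(size=mu.shape)  # reparameterised sample
        resid = y - X @ w
        total += 0.5 * np.sum(resid**2) / noise_sigma**2
    return total / n_samples

# Variational free energy: the MDL-style loss to minimise over (mu, rho)
loss = kl_diag_gaussians(mu, sigma, prior_sigma) + expected_nll(mu, sigma, X, y)
print(f"variational loss: {loss:.2f}")

# Hypothetical pruning rule in the spirit of the abstract: drop weights
# whose posterior signal-to-noise ratio |mu| / sigma falls below a
# threshold. The exact criterion and the 0.5 cutoff are assumptions.
snr = np.abs(mu) / sigma
keep = snr > 0.5
mu_pruned = np.where(keep, mu, 0.0)
print(f"pruned {np.sum(~keep)} of {mu.size} weights")
```

In practice the posterior parameters (mu, rho) would be trained by gradient descent on this loss, which is what makes the method a drop-in replacement for an ordinary training objective.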

Cite

Text

Graves. "Practical Variational Inference for Neural Networks." Neural Information Processing Systems, 2011.

Markdown

[Graves. "Practical Variational Inference for Neural Networks." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/graves2011neurips-practical/)

BibTeX

@inproceedings{graves2011neurips-practical,
  title     = {{Practical Variational Inference for Neural Networks}},
  author    = {Graves, Alex},
  booktitle = {Neural Information Processing Systems},
  year      = {2011},
  pages     = {2348--2356},
  url       = {https://mlanthology.org/neurips/2011/graves2011neurips-practical/}
}