Practical Variational Inference for Neural Networks
Abstract
Variational methods have previously been explored as a tractable approximation to Bayesian inference for neural networks. However, the approaches proposed so far have only been applicable to a few simple network architectures. This paper introduces an easy-to-implement stochastic variational method (or equivalently, minimum description length loss function) that can be applied to most neural networks. Along the way it revisits several common regularisers from a variational perspective. It also provides a simple pruning heuristic that can both drastically reduce the number of network weights and lead to improved generalisation. Experimental results are provided for a hierarchical multidimensional recurrent neural network applied to the TIMIT speech corpus.
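As a concrete (if anachronistic) illustration of the objective the abstract describes, below is a minimal PyTorch sketch: a diagonal Gaussian posterior over the weights of a toy regression network, trained with one sampled weight configuration per step against an "error + KL-to-prior" loss (the KL term is the "description length" penalty in the MDL view), followed by a signal-to-noise pruning pass in the spirit of the paper's heuristic. This sketch uses the modern reparameterisation trick and squared error in place of the paper's own gradient derivation and likelihood; the architecture, prior, learning rate, and pruning threshold are all illustrative assumptions, not values from the paper.

```python
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy regression data: y = sin(x) plus noise.
x = torch.linspace(-3, 3, 128).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

# Variational parameters: a mean and log-std for every weight and bias of a
# one-hidden-layer network (1 -> 16 -> 1), i.e. a diagonal Gaussian posterior.
shapes = [(16, 1), (16,), (1, 16), (1,)]
mu = [(0.1 * torch.randn(s)).requires_grad_() for s in shapes]
log_sigma = [torch.full(s, -3.0, requires_grad=True) for s in shapes]

prior_sigma = 1.0  # fixed zero-mean Gaussian prior over all weights (assumed)
opt = torch.optim.Adam(mu + log_sigma, lr=1e-2)

def forward(ws, x):
    w1, b1, w2, b2 = ws
    h = torch.tanh(F.linear(x, w1, b1))
    return F.linear(h, w2, b2)

for step in range(2000):
    # One reparameterised sample per step from q(w) = N(mu, sigma^2).
    ws = [m + ls.exp() * torch.randn_like(m) for m, ls in zip(mu, log_sigma)]
    # Squared error stands in for the negative log-likelihood here.
    nll = F.mse_loss(forward(ws, x), y, reduction='sum')
    # Closed-form KL(q || p) between diagonal Gaussians, summed over weights:
    # the "description length" of the weights under the MDL reading.
    kl = sum(
        ((m**2 + ls.exp()**2) / (2 * prior_sigma**2)
         + math.log(prior_sigma) - ls - 0.5).sum()
        for m, ls in zip(mu, log_sigma)
    )
    loss = nll + kl
    opt.zero_grad()
    loss.backward()
    opt.step()

# Pruning heuristic in the spirit of the abstract: discard weights whose
# posterior signal-to-noise ratio |mu| / sigma falls below a threshold.
with torch.no_grad():
    snr = torch.cat([(m.abs() / ls.exp()).flatten()
                     for m, ls in zip(mu, log_sigma)])
    keep = snr > 1.0  # threshold is an illustrative assumption
    frac_pruned = 1.0 - keep.float().mean().item()
    print(f"would prune {100 * frac_pruned:.1f}% of weights")
```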
Cite

Text:

Graves. "Practical Variational Inference for Neural Networks." Neural Information Processing Systems, 2011.

Markdown:

[Graves. "Practical Variational Inference for Neural Networks." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/graves2011neurips-practical/)

BibTeX:
@inproceedings{graves2011neurips-practical,
  title = {{Practical Variational Inference for Neural Networks}},
  author = {Graves, Alex},
  booktitle = {Neural Information Processing Systems},
  year = {2011},
  pages = {2348--2356},
  url = {https://mlanthology.org/neurips/2011/graves2011neurips-practical/}
}