Sparsifying Bayesian Neural Networks with Latent Binary Variables and Normalizing Flows

Abstract

Artificial neural networks are powerful machine learning methods used in many modern applications. A common issue is that they have millions or billions of parameters, and therefore tend to overfit. Bayesian neural networks (BNN) can improve on this since they incorporate parameter uncertainty. Latent binary Bayesian neural networks (LBBNN) further take into account structural uncertainty by allowing the weights to be turned on or off, enabling inference in the joint space of weights and structures. Mean-field variational inference is typically used for computation within such models. In this paper, we consider two extensions of variational inference for the LBBNN. Firstly, by using the local reparametrization trick (LCRT), we improve computational efficiency. Secondly, and more importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, we learn a more flexible variational posterior than the mean-field Gaussian. Experimental results on real data show that this improves predictive power compared to using mean-field variational inference on the LBBNN method, while also obtaining sparser networks. We also perform two simulation studies. In the first, we consider variable selection in a logistic regression setting, where the more flexible variational distribution improves results. In the second study, we compare predictive uncertainty based on data generated from two-dimensional Gaussian distributions. Here, we argue that our Bayesian methods lead to more realistic estimates of predictive uncertainty.
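
To make the local reparametrization idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a mean-field variational posterior in which each weight is the product of a Bernoulli inclusion indicator (probability alpha) and a Gaussian slab (mean mu, standard deviation sigma). Instead of sampling every weight, the pre-activation of each output unit is sampled from a Gaussian that matches the mean and variance implied by the weights. The function name lrt_preactivations and all parameter names are hypothetical and do not come from the paper.

```python
import math
import random


def lrt_preactivations(x, mu, sigma, alpha, rng):
    """Sample one layer's pre-activations with local reparametrization.

    Assumed (hypothetical) setup: weight w[i][j] = gamma[i][j] * beta[i][j],
    with gamma ~ Bernoulli(alpha[i][j]) and beta ~ N(mu[i][j], sigma[i][j]**2).
    x is a list of n_in inputs; mu, sigma, alpha are n_in x n_out nested lists.
    """
    n_in, n_out = len(x), len(mu[0])
    samples = []
    for j in range(n_out):
        # E[w] = alpha * mu
        mean = sum(x[i] * alpha[i][j] * mu[i][j] for i in range(n_in))
        # Var[w] = alpha * (sigma^2 + (1 - alpha) * mu^2)
        var = sum(
            x[i] ** 2
            * alpha[i][j]
            * (sigma[i][j] ** 2 + (1.0 - alpha[i][j]) * mu[i][j] ** 2)
            for i in range(n_in)
        )
        # One Gaussian draw per output unit instead of one per weight.
        samples.append(mean + math.sqrt(var) * rng.gauss(0.0, 1.0))
    return samples


rng = random.Random(0)
x = [1.0, 2.0]
mu = [[0.5], [-1.0]]
sigma = [[0.1], [0.2]]
alpha = [[0.9], [0.3]]
h = lrt_preactivations(x, mu, sigma, alpha, rng)  # one noisy pre-activation
```

In this sketch a layer with n_in inputs and n_out outputs needs n_out random draws per input rather than n_in * n_out, which is the kind of saving the trick aims for. The Gaussian form of the pre-activation is an approximation here, since the weights themselves follow a spike-and-slab mixture.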

Cite

Text

Skaaret-Lund et al. "Sparsifying Bayesian Neural Networks with Latent Binary Variables and Normalizing Flows." Transactions on Machine Learning Research, 2024.

Markdown

[Skaaret-Lund et al. "Sparsifying Bayesian Neural Networks with Latent Binary Variables and Normalizing Flows." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/skaaretlund2024tmlr-sparsifying/)

BibTeX

@article{skaaretlund2024tmlr-sparsifying,
  title     = {{Sparsifying Bayesian Neural Networks with Latent Binary Variables and Normalizing Flows}},
  author    = {Skaaret-Lund, Lars and Storvik, Geir and Hubin, Aliaksandr},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/skaaretlund2024tmlr-sparsifying/}
}