Parsimonious Bayesian Deep Networks

Abstract

Combining Bayesian nonparametrics with a forward model selection strategy, we construct parsimonious Bayesian deep networks (PBDNs) that infer capacity-regularized network architectures from the data and require neither cross-validation nor fine-tuning during training. The first of a PBDN's two essential components is a special infinitely wide single-hidden-layer neural network whose number of active hidden units can be inferred from the data. The second is a greedy layer-wise learning algorithm that uses a forward model selection criterion to decide when to stop adding hidden layers. We develop both Gibbs sampling and stochastic-gradient-descent-based maximum a posteriori (MAP) inference for PBDNs, providing state-of-the-art classification accuracy and interpretable data subtypes near the decision boundaries, while maintaining low computational complexity for out-of-sample prediction.
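The abstract names two mechanisms: nonparametric shrinkage that deactivates surplus hidden units within an effectively infinitely wide layer, and a greedy forward loop that stops adding layers once a model selection criterion no longer improves. As a rough, minimal sketch of that control flow (not the paper's actual algorithm), the Python below substitutes conventional stand-ins: an L1-penalized logistic readout over random sigmoid features plays the role of the Bayesian shrinkage prior that prunes inactive units, and a held-out log loss plays the role of the forward model selection criterion. All function names and hyperparameters (`fit_layer`, `val_loss`, `max_units`, `l1`, and so on) are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_layer(X, y, max_units=64, l1=1e-2, lr=0.5, epochs=300, tol=1e-3):
    """Fit one over-complete hidden layer (fixed random sigmoid features) with
    an L1-penalized logistic readout, then drop units whose readout weight is
    shrunk to ~0 -- a crude stand-in for the paper's nonparametric shrinkage."""
    W = rng.normal(scale=1.0, size=(X.shape[1], max_units))
    Z = sigmoid(X @ W)
    beta = np.zeros(max_units)
    for _ in range(epochs):  # proximal gradient steps on the logistic loss
        beta -= lr * Z.T @ (sigmoid(Z @ beta) - y) / len(y)
        beta = np.sign(beta) * np.maximum(np.abs(beta) - lr * l1, 0.0)
    keep = np.abs(beta) > tol  # the "active" hidden units
    return W[:, keep], beta[keep]

def val_loss(X, y, W, beta):
    """Held-out log loss; stands in for the forward model selection criterion."""
    p = np.clip(sigmoid(sigmoid(X @ W) @ beta), 1e-9, 1.0 - 1e-9)
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

# Toy binary classification problem.
X = rng.normal(size=(400, 5))
y = (X[:, 0] * X[:, 1] > 0).astype(float)
Xtr, ytr, Xva, yva = X[:300], y[:300], X[300:], y[300:]

layers, best = [], np.inf
Htr, Hva = Xtr, Xva
while len(layers) < 5:  # greedy layer-wise growth with a forward stopping rule
    W, beta = fit_layer(Htr, ytr)
    loss = val_loss(Hva, yva, W, beta)
    if loss >= best:  # adding this layer no longer helps: stop
        break
    best = loss
    layers.append((W, beta))
    Htr, Hva = sigmoid(Htr @ W), sigmoid(Hva @ W)  # activations feed the next layer

print("depth:", len(layers), "active units per layer:", [W.shape[1] for W, _ in layers])
```

The while-loop is the part that mirrors the paper's strategy: capacity grows one layer at a time and stops as soon as the criterion degrades, so depth, like width, is inferred from the data rather than cross-validated.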

Cite

Text

Mingyuan Zhou. "Parsimonious Bayesian Deep Networks." Neural Information Processing Systems, 2018.

Markdown

[Mingyuan Zhou. "Parsimonious Bayesian Deep Networks." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/zhou2018neurips-parsimonious/)

BibTeX

@inproceedings{zhou2018neurips-parsimonious,
  title     = {{Parsimonious Bayesian Deep Networks}},
  author    = {Zhou, Mingyuan},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {3190--3200},
  url       = {https://mlanthology.org/neurips/2018/zhou2018neurips-parsimonious/}
}