Wide Stochastic Networks: Gaussian Limit and PAC-Bayesian Training

Abstract

The limit of infinite width allows for substantial simplifications in the analytical study of over-parameterised neural networks. With a suitable random initialisation, an extremely large network exhibits an approximately Gaussian behaviour. In the present work, we establish a similar result for a simple stochastic architecture whose parameters are random variables, holding both before and during training. The explicit evaluation of the output distribution allows for a PAC-Bayesian training procedure that directly optimises the generalisation bound. For a large but finite-width network, we show empirically on MNIST that this training approach can outperform standard PAC-Bayesian methods.
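As a rough illustration of the Gaussian limit mentioned above, the following NumPy sketch (not taken from the paper; the one-hidden-layer architecture, the 1/sqrt(width) scaling, and the tanh activation are assumptions made purely for the demonstration) samples many randomly initialised networks at a fixed input and checks that the excess kurtosis of the scalar output shrinks towards zero as the width grows, as one expects from a central-limit argument:

import numpy as np

# Hypothetical demo: one-hidden-layer network with i.i.d. Gaussian weights.
# For a fixed input x, the hidden contributions W2_i * tanh(W1_i . x) are
# i.i.d. across units, so the output approaches a Gaussian as width grows.
def network_output(x, width, rng):
    d = x.shape[0]
    W1 = rng.normal(size=(width, d)) / np.sqrt(d)      # input-to-hidden
    W2 = rng.normal(size=width) / np.sqrt(width)       # hidden-to-output
    return W2 @ np.tanh(W1 @ x)

x = np.random.default_rng(0).normal(size=10)           # fixed test input
for width in (10, 100, 1000):
    samples = np.array([network_output(x, width, np.random.default_rng(s))
                        for s in range(5000)])
    z = (samples - samples.mean()) / samples.std()
    # Excess kurtosis of a Gaussian is 0; it should shrink with the width.
    print(width, np.mean(z**4) - 3.0)

Note that this sketch only probes the output distribution at initialisation; the paper's contribution is that a Gaussian description also holds for a stochastic architecture whose parameters remain random variables during training, which is what enables the PAC-Bayesian training procedure.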

Cite

Text

Clerico et al. "Wide Stochastic Networks: Gaussian Limit and PAC-Bayesian Training." Proceedings of The 34th International Conference on Algorithmic Learning Theory, 2023.

Markdown

[Clerico et al. "Wide Stochastic Networks: Gaussian Limit and PAC-Bayesian Training." Proceedings of The 34th International Conference on Algorithmic Learning Theory, 2023.](https://mlanthology.org/alt/2023/clerico2023alt-wide/)

BibTeX

@inproceedings{clerico2023alt-wide,
  title     = {{Wide Stochastic Networks: Gaussian Limit and PAC-Bayesian Training}},
  author    = {Clerico, Eugenio and Deligiannidis, George and Doucet, Arnaud},
  booktitle = {Proceedings of The 34th International Conference on Algorithmic Learning Theory},
  year      = {2023},
  pages     = {447--470},
  volume    = {201},
  url       = {https://mlanthology.org/alt/2023/clerico2023alt-wide/}
}