Non-Vacuous Generalisation Bounds for Shallow Neural Networks
Abstract
We focus on a specific class of shallow neural networks with a single hidden layer, namely those with $L_2$-normalised data and either a sigmoid-shaped Gaussian error function (“erf”) activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through the PAC-Bayesian theory; unlike most existing such bounds, they apply to neural networks with deterministic rather than randomised parameters. Our bounds are empirically non-vacuous when the network is trained with vanilla stochastic gradient descent on MNIST and Fashion-MNIST.
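The abstract pins down the architecture precisely: a single hidden layer acting on $L_2$-normalised inputs, with an erf or GELU activation. The minimal NumPy sketch below illustrates that function class only; it is not the authors' code, and the layer widths, weight scales, and use of `scipy.special.erf` are illustrative assumptions.

```python
import numpy as np
from scipy.special import erf

def l2_normalise(x):
    """Project each input onto the unit sphere (the L2-normalised data assumption)."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def gelu(z):
    """Gaussian Error Linear Unit: z * Phi(z), with Phi the standard normal CDF."""
    return z * 0.5 * (1.0 + erf(z / np.sqrt(2.0)))

def shallow_net(x, W1, b1, W2, b2, activation="erf"):
    """Single hidden layer followed by a linear output layer."""
    z = l2_normalise(x) @ W1.T + b1
    h = erf(z) if activation == "erf" else gelu(z)
    return h @ W2.T + b2

# Illustrative shapes only: 784-dimensional inputs (e.g. flattened MNIST),
# 100 hidden units, 10 output classes; weights are random placeholders.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 784))
W1, b1 = 0.05 * rng.normal(size=(100, 784)), np.zeros(100)
W2, b2 = 0.05 * rng.normal(size=(10, 100)), np.zeros(10)
logits = shallow_net(x, W1, b1, W2, b2, activation="erf")
print(logits.shape)  # (5, 10)
```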
Cite
Text
Biggs and Guedj. "Non-Vacuous Generalisation Bounds for Shallow Neural Networks." International Conference on Machine Learning, 2022.

Markdown
[Biggs and Guedj. "Non-Vacuous Generalisation Bounds for Shallow Neural Networks." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/biggs2022icml-nonvacuous/)

BibTeX
@inproceedings{biggs2022icml-nonvacuous,
title = {{Non-Vacuous Generalisation Bounds for Shallow Neural Networks}},
author = {Biggs, Felix and Guedj, Benjamin},
booktitle = {International Conference on Machine Learning},
year = {2022},
pages = {1963--1981},
volume = {162},
url = {https://mlanthology.org/icml/2022/biggs2022icml-nonvacuous/}
}