Modern Neural Networks Generalize on Small Data Sets

Abstract

In this paper, we use a linear program to empirically decompose fitted neural networks into ensembles of low-bias sub-networks. We show that these sub-networks are relatively uncorrelated, which produces an internal regularization process, much like that of a random forest, and helps explain why neural networks are surprisingly resistant to overfitting. We then demonstrate this in practice by applying large neural networks, with hundreds of parameters per training observation, to a collection of 116 real-world data sets from the UCI Machine Learning Repository. This collection of data sets contains far fewer training examples than the image classification tasks typically studied in the deep learning literature, as well as non-trivial label noise. We show that even in this setting deep neural networks are capable of achieving superior classification accuracy without overfitting.
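The sketch below illustrates the ensemble view described in the abstract: train an over-parameterized network on a small tabular data set, split it into sub-networks whose logits sum to the full network's logit, and measure how correlated those sub-networks are. It is only a rough illustration under stated assumptions; in particular, the random partition of hidden units used here is a stand-in for the authors' linear-program decomposition, and the synthetic data set is a placeholder for a UCI task.

# A minimal, hypothetical sketch of the decomposition idea; the random grouping of
# hidden units below is illustrative only -- the paper finds its decomposition with
# a linear program rather than a random partition.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Small tabular problem standing in for a UCI task (few observations, some label noise).
X, y = make_classification(n_samples=200, n_features=20, flip_y=0.1, random_state=0)

# Deliberately over-parameterized network: hundreds of parameters per observation.
net = MLPClassifier(hidden_layer_sizes=(256,), activation="relu",
                    alpha=0.0, max_iter=2000, random_state=0).fit(X, y)

W1, b1 = net.coefs_[0], net.intercepts_[0]   # input -> hidden weights
W2, b2 = net.coefs_[1], net.intercepts_[1]   # hidden -> output logit weights
H = np.maximum(X @ W1 + b1, 0.0)             # hidden activations (ReLU)

# Partition the hidden units into K groups; each group defines one sub-network
# whose logit is that group's contribution to the full network's logit.
K = 8
groups = np.array_split(rng.permutation(H.shape[1]), K)
sub_logits = np.column_stack(
    [H[:, g] @ W2[g, 0] + b2[0] / K for g in groups]
)

# The sub-networks sum back to the full network's logit by construction.
full_logit = H @ W2[:, 0] + b2[0]
assert np.allclose(sub_logits.sum(axis=1), full_logit)

# Average pairwise correlation between sub-network logits: low values point to the
# ensemble-like, internally regularized behaviour described in the abstract.
corr = np.corrcoef(sub_logits.T)
off_diag = corr[~np.eye(K, dtype=bool)]
print("mean pairwise correlation of sub-network logits:", off_diag.mean())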

Cite

Text

Olson et al. "Modern Neural Networks Generalize on Small Data Sets." Neural Information Processing Systems, 2018.

Markdown

[Olson et al. "Modern Neural Networks Generalize on Small Data Sets." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/olson2018neurips-modern/)

BibTeX

@inproceedings{olson2018neurips-modern,
  title     = {{Modern Neural Networks Generalize on Small Data Sets}},
  author    = {Olson, Matthew and Wyner, Abraham and Berk, Richard},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {3619--3628},
  url       = {https://mlanthology.org/neurips/2018/olson2018neurips-modern/}
}