Provable Tempered Overfitting of Minimal Nets and Typical Nets

Abstract

We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary weights fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of weights) and a random interpolating NN. For both learning rules, we prove overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that: (1) apply to deep NNs, and (2) do not require a very high or very low input dimension.
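For readers unfamiliar with the terminology, the benign/tempered/catastrophic taxonomy referenced in the abstract can be sketched informally as below. The notation here is illustrative and not taken from the paper: $p$ denotes the label-noise rate, $\hat{f}_n$ the interpolating predictor trained on $n$ noisy samples, and $L(\hat{f}_n)$ its clean test error.

% Illustrative sketch of the three overfitting regimes, assuming binary labels
% flipped independently with probability p < 1/2, so the Bayes error is p.
\begin{align*}
\text{benign:}\quad        & L(\hat{f}_n) \xrightarrow[n\to\infty]{} p, \\
\text{tempered:}\quad      & p < \limsup_{n\to\infty} L(\hat{f}_n) \le c(p) < \tfrac{1}{2}
                             \quad \text{for some noise-dependent constant } c(p), \\
\text{catastrophic:}\quad  & L(\hat{f}_n) \xrightarrow[n\to\infty]{} \tfrac{1}{2}.
\end{align*}

The paper's claim is that, in its setting (deep fully connected NNs with binary weights, interpolating a noisy training set), both the minimal-size and the random interpolating network fall in the tempered regime.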

Cite

Text

Harel et al. "Provable Tempered Overfitting of Minimal Nets and Typical Nets." Neural Information Processing Systems, 2024. doi:10.52202/079017-1694

Markdown

[Harel et al. "Provable Tempered Overfitting of Minimal Nets and Typical Nets." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/harel2024neurips-provable/) doi:10.52202/079017-1694

BibTeX

@inproceedings{harel2024neurips-provable,
  title     = {{Provable Tempered Overfitting of Minimal Nets and Typical Nets}},
  author    = {Harel, Itamar and Hoza, William M. and Vardi, Gal and Evron, Itay and Srebro, Nathan and Soudry, Daniel},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-1694},
  url       = {https://mlanthology.org/neurips/2024/harel2024neurips-provable/}
}