Provable Tempered Overfitting of Minimal Nets and Typical Nets

Abstract

We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary weights, fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of weights) and a random interpolating NN. For both learning rules, we prove that overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that apply to deep NNs and do not require an extremely high or low input dimension.

Cite

Text

Harel et al. "Provable Tempered Overfitting of Minimal Nets and Typical Nets." ICML 2024 Workshops: HiLD, 2024.

Markdown

[Harel et al. "Provable Tempered Overfitting of Minimal Nets and Typical Nets." ICML 2024 Workshops: HiLD, 2024.](https://mlanthology.org/icmlw/2024/harel2024icmlw-provable/)

BibTeX

@inproceedings{harel2024icmlw-provable,
  title     = {{Provable Tempered Overfitting of Minimal Nets and Typical Nets}},
  author    = {Harel, Itamar and Hoza, William M. and Vardi, Gal and Evron, Itay and Srebro, Nathan and Soudry, Daniel},
  booktitle = {ICML 2024 Workshops: HiLD},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/harel2024icmlw-provable/}
}