Memorization with Neural Nets: Going Beyond the Worst Case

Abstract

In practice, deep neural networks are often able to easily interpolate their training data. To understand this phenomenon, many works have aimed to quantify the memorization capacity of a neural network architecture: the largest number of points such that the architecture can interpolate any placement of these points with any assignment of labels. For real-world data, however, one intuitively expects the presence of a benign structure so that interpolation already occurs at a smaller network size than suggested by memorization capacity. In this paper, we investigate interpolation by adopting an instance-specific viewpoint. We introduce a simple randomized algorithm that, given a fixed finite data set with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time. The required number of parameters is linked to geometric properties of the two classes and their mutual arrangement. As a result, we obtain guarantees that are independent of the number of samples and hence move beyond worst-case memorization capacity bounds. We verify our theoretical result with numerical experiments and additionally investigate the effectiveness of the algorithm on MNIST and CIFAR-10.
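As an illustrative sketch only (not the paper's randomized construction), the snippet below fits a small three-layer ReLU network to a synthetic two-class dataset with gradient descent and then checks whether it interpolates the training labels, i.e., whether the sign of the network output matches every label. The data geometry, hidden widths, and optimizer are assumptions chosen for illustration; the paper instead constructs the weights directly, with the required number of parameters governed by the geometry of the two classes.

```python
# Illustrative sketch: fit a three-layer ReLU network to a synthetic two-class
# dataset and check interpolation. NOT the paper's construction; widths,
# optimizer, and data below are assumptions made for illustration.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic two-class data: two well-separated Gaussian blobs (assumed geometry).
n_per_class, d = 50, 10
x_pos = torch.randn(n_per_class, d) + 2.0
x_neg = torch.randn(n_per_class, d) - 2.0
X = torch.cat([x_pos, x_neg])
y = torch.cat([torch.ones(n_per_class), -torch.ones(n_per_class)])

# Three-layer ReLU network (two hidden layers); hidden widths are illustrative.
model = nn.Sequential(
    nn.Linear(d, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 1),
)

opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    out = model(X).squeeze(-1)
    loss = torch.clamp(1.0 - y * out, min=0.0).mean()  # hinge loss on +/-1 labels
    loss.backward()
    opt.step()

# Interpolation check: does sign(f(x_i)) match y_i on every training point?
with torch.no_grad():
    preds = torch.sign(model(X).squeeze(-1))
interpolates = bool((preds == y).all())
print(f"training points: {len(y)}, interpolation achieved: {interpolates}")
```

On well-separated classes like the blobs above, a network of this modest size typically interpolates all training points, which mirrors the abstract's point that benign geometric structure allows interpolation at sizes well below worst-case memorization capacity bounds.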

Cite

Text

Dirksen et al. "Memorization with Neural Nets: Going Beyond the Worst Case." Journal of Machine Learning Research, 2024.

Markdown

[Dirksen et al. "Memorization with Neural Nets: Going Beyond the Worst Case." Journal of Machine Learning Research, 2024.](https://mlanthology.org/jmlr/2024/dirksen2024jmlr-memorization/)

BibTeX

@article{dirksen2024jmlr-memorization,
  title     = {{Memorization with Neural Nets: Going Beyond the Worst Case}},
  author    = {Dirksen, Sjoerd and Finke, Patrick and Genzel, Martin},
  journal   = {Journal of Machine Learning Research},
  year      = {2024},
  pages     = {1--38},
  volume    = {25},
  url       = {https://mlanthology.org/jmlr/2024/dirksen2024jmlr-memorization/}
}