What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation

Abstract

Deep learning algorithms are well known to have a propensity for fitting the training data very well, often fitting even outliers and mislabeled data points. Such fitting requires memorization of training data labels, a phenomenon that has attracted significant research interest but has not yet been given a compelling explanation. A recent work of Feldman (2019) proposes a theoretical explanation for this phenomenon based on a combination of two insights. First, natural image and data distributions are (informally) known to be long-tailed, that is, they have a significant fraction of rare and atypical examples. Second, in a simple theoretical model, such memorization is necessary for achieving close-to-optimal generalization error when the data distribution is long-tailed. However, no direct empirical evidence for this explanation, nor even an approach for obtaining such evidence, was given.
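
The influence estimation in the title builds on the memorization score of Feldman (2019): for a learning algorithm A, training set S, and example i, mem(A, S, i) = Pr[h(x_i) = y_i for h = A(S)] - Pr[h(x_i) = y_i for h = A(S \ {i})], that is, how much including example i changes the probability that the trained model predicts its label. Below is a minimal sketch, in the spirit of the paper's subsampled estimator, that reuses one pool of models trained on random subsets of the data and compares, per example, the accuracy of models that saw it against models that did not. The array names, the 70% subset ratio, and the random stand-in data are illustrative assumptions, not the paper's exact setup.

import numpy as np

def estimate_memorization(masks, correct):
    """Estimate per-example memorization scores from subset-trained models.

    masks:   (num_models, num_examples) bool array; masks[t, i] is True if
             example i was in the training subset of model t.
    correct: (num_models, num_examples) array; correct[t, i] is 1 if model t
             predicted the true label of training example i.

    The score for example i is the accuracy on i of models trained WITH i
    minus the accuracy on i of models trained WITHOUT i.
    """
    masks = masks.astype(bool)
    correct = correct.astype(float)

    in_counts = masks.sum(axis=0)        # models whose subset included i
    out_counts = (~masks).sum(axis=0)    # models whose subset excluded i

    acc_in = (correct * masks).sum(axis=0) / np.maximum(in_counts, 1)
    acc_out = (correct * ~masks).sum(axis=0) / np.maximum(out_counts, 1)
    return acc_in - acc_out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_models, num_examples = 200, 10
    # Hypothetical stand-ins: each model is trained on a random ~70% subset...
    masks = rng.random((num_models, num_examples)) < 0.7
    # ...and "correct" would come from evaluating each trained model on the
    # full training set; random values here only exercise the estimator.
    correct = rng.random((num_models, num_examples)) < 0.8
    print(estimate_memorization(masks, correct))

Sharing a single pool of subset-trained models across all examples is what keeps the estimator tractable: the same models supply both the "included" and "excluded" averages for every training example, instead of requiring one leave-one-out retraining run per example.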

Cite

Text

Feldman and Zhang. "What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation." Neural Information Processing Systems, 2020.

Markdown

[Feldman and Zhang. "What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/feldman2020neurips-neural/)

BibTeX

@inproceedings{feldman2020neurips-neural,
  title     = {{What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation}},
  author    = {Feldman, Vitaly and Zhang, Chiyuan},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/feldman2020neurips-neural/}
}