What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Abstract
Deep learning algorithms are well known for their propensity to fit the training data very well, often fitting even outliers and mislabeled data points. Such fitting requires memorization of training data labels, a phenomenon that has attracted significant research interest but has not been given a compelling explanation so far. A recent work of Feldman (2019) proposes a theoretical explanation for this phenomenon based on a combination of two insights. First, natural image and data distributions are (informally) known to be long-tailed, that is, they have a significant fraction of rare and atypical examples. Second, in a simple theoretical model such memorization is necessary for achieving close-to-optimal generalization error when the data distribution is long-tailed. However, no direct empirical evidence for this explanation, nor even an approach for obtaining such evidence, has been given.
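The influence-estimation idea in the title can be illustrated with a toy sketch of a memorization score: the change in the probability that a model predicts example i correctly when i is included in versus excluded from the training set, averaged over random training subsets. The 1-nearest-neighbor learner, the synthetic 1-D data, and the 0.7 subsampling rate below are illustrative assumptions for this sketch, not the paper's actual setup (which trains deep networks on random subsets of real datasets).

```python
import random

def predict_1nn(train, x):
    # 1-NN "model": predict the label of the nearest stored point (1-D features).
    return min(train, key=lambda p: abs(p[0] - x))[1]

def memorization_score(data, i, trials=200, keep_prob=0.7, seed=0):
    """Estimate mem(i) = Pr[correct on (x_i, y_i) | i in training set]
    - Pr[correct | i held out], over random subsets of the other examples.
    keep_prob and trials are illustrative choices, not from the paper."""
    rng = random.Random(seed)
    x_i, y_i = data[i]
    rest = data[:i] + data[i + 1:]
    hit_in = hit_out = 0
    for _ in range(trials):
        subset = [p for p in rest if rng.random() < keep_prob] or [rest[0]]
        hit_out += predict_1nn(subset, x_i) == y_i              # i excluded
        hit_in += predict_1nn(subset + [(x_i, y_i)], x_i) == y_i  # i included
    return (hit_in - hit_out) / trials

# A tight cluster of typical label-0 points plus one atypical label-1 outlier.
data = [(0.1 * k, 0) for k in range(10)] + [(5.0, 1)]
```

On this toy data, the outlier at index 10 is only predicted correctly when it is in the training set, so its score is near 1 (it must be memorized), while a typical point like index 5 scores near 0.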
Cite

Feldman and Zhang. "What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation." Neural Information Processing Systems, 2020. https://mlanthology.org/neurips/2020/feldman2020neurips-neural/

BibTeX
@inproceedings{feldman2020neurips-neural,
title = {{What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation}},
author = {Feldman, Vitaly and Zhang, Chiyuan},
booktitle = {Neural Information Processing Systems},
year = {2020},
url = {https://mlanthology.org/neurips/2020/feldman2020neurips-neural/}
}