Random Feature Amplification: Feature Learning and Generalization in Neural Networks
Abstract
In this work, we provide a characterization of the feature-learning process in two-layer ReLU networks trained by gradient descent on the logistic loss following random initialization. We consider data with binary labels that are generated by an XOR-like function of the input features. We permit a constant fraction of the training labels to be corrupted by an adversary. We show that, although linear classifiers are no better than random guessing for the distribution we consider, two-layer ReLU networks trained by gradient descent achieve generalization error close to the label noise rate. We develop a novel proof technique that shows that at initialization, the vast majority of neurons function as random features that are only weakly correlated with useful features, and the gradient descent dynamics "amplify" these weak, random features to strong, useful features.
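The sketch below is an illustrative, assumption-laden rendering of the setup described in the abstract: XOR-like cluster data with a fraction of flipped labels, and a two-layer ReLU network trained by full-batch gradient descent on the logistic loss from small random initialization. All concrete choices (dimension, width, cluster means, noise rate, step size, modeling the adversarial corruption as random flips) are assumptions for illustration, not the exact setting analyzed in the paper.

```python
# Minimal sketch (assumptions noted above), not the paper's exact construction.
import numpy as np

rng = np.random.default_rng(0)
d, n, width = 50, 2000, 512       # input dim, sample size, hidden width (assumed)
noise_rate = 0.1                  # fraction of corrupted labels (assumed)

# XOR-like data: label +1 for clusters near +/- e1, label -1 near +/- e2 (assumed means).
mu1, mu2 = np.eye(d)[0], np.eye(d)[1]
signs = rng.choice([-1.0, 1.0], size=n)
use_mu1 = rng.random(n) < 0.5
centers = np.where(use_mu1[:, None], signs[:, None] * mu1, signs[:, None] * mu2)
X = centers + 0.1 * rng.standard_normal((n, d))   # small within-cluster noise (assumed)
y = np.where(use_mu1, 1.0, -1.0)
flip = rng.random(n) < noise_rate                 # corruption modeled here as random flips
y[flip] *= -1

# Two-layer ReLU network f(x) = sum_j a_j * relu(w_j . x), small random first layer,
# fixed +/- second layer (a common simplification, assumed here).
W = 0.01 * rng.standard_normal((width, d))
a = rng.choice([-1.0, 1.0], size=width) / width

def forward(X, W, a):
    return np.maximum(X @ W.T, 0.0) @ a

lr = 0.5
for step in range(500):
    margins = y * forward(X, W, a)                  # y_i * f(x_i)
    g = -y / (1.0 + np.exp(margins))                # derivative of log(1 + exp(-y f)) w.r.t. f
    act = (X @ W.T > 0).astype(float)               # ReLU derivative per (sample, neuron)
    grad_W = ((g[:, None] * act) * a[None, :]).T @ X / n
    W -= lr * grad_W

acc = np.mean(np.sign(forward(X, W, a)) == y)
print(f"accuracy on (noisy) training labels: {acc:.3f}")
```

In this toy run, accuracy against the noisy labels is expected to approach one minus the noise rate, mirroring the abstract's claim that the trained network achieves generalization error close to the label noise rate.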
Cite
Text
Frei et al. "Random Feature Amplification: Feature Learning and Generalization in Neural Networks." Journal of Machine Learning Research, 2023.
Markdown
[Frei et al. "Random Feature Amplification: Feature Learning and Generalization in Neural Networks." Journal of Machine Learning Research, 2023.](https://mlanthology.org/jmlr/2023/frei2023jmlr-random/)
BibTeX
@article{frei2023jmlr-random,
  title   = {{Random Feature Amplification: Feature Learning and Generalization in Neural Networks}},
  author  = {Frei, Spencer and Chatterji, Niladri S. and Bartlett, Peter L.},
  journal = {Journal of Machine Learning Research},
  year    = {2023},
  volume  = {24},
  pages   = {1--49},
  url     = {https://mlanthology.org/jmlr/2023/frei2023jmlr-random/}
}