Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory
Abstract
We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. In this work, we: (1) prove the widespread existence of suboptimal local minima in the loss landscape of neural networks, and we use our theory to find examples; (2) show that small-norm parameters are not optimal for generalization; (3) demonstrate that ResNets do not conform to wide-network theories, such as the neural tangent kernel, and that the interaction between skip connections and batch normalization plays a role; (4) find that rank does not correlate with generalization or robustness in a practical setting.
Cite
Text
Goldblum et al. "Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory." International Conference on Learning Representations, 2020.
Markdown
[Goldblum et al. "Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/goldblum2020iclr-truth/)
BibTeX
@inproceedings{goldblum2020iclr-truth,
title = {{Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory}},
author = {Goldblum, Micah and Geiping, Jonas and Schwarzschild, Avi and Moeller, Michael and Goldstein, Tom},
booktitle = {International Conference on Learning Representations},
year = {2020},
url = {https://mlanthology.org/iclr/2020/goldblum2020iclr-truth/}
}