A Modern Take on the Bias-Variance Tradeoff in Neural Networks
Abstract
Recent empirical results on over-parameterized deep networks are marked by a striking absence of the classic U-shaped test error curve: test error keeps decreasing in wider networks. Researchers are actively working on bridging this discrepancy by proposing better complexity measures. Instead, we directly measure prediction bias and variance for four classification and regression tasks on modern deep networks. We find that both bias and variance can decrease as the number of parameters grows. Qualitatively, the phenomenon persists across a number of gradient-based optimizers. To better understand the role of optimization, we decompose the total variance into variance due to training set sampling and variance due to initialization. Variance due to initialization is significant in the under-parameterized regime. In the over-parameterized regime, total variance is much lower and dominated by variance due to sampling. We provide theoretical analysis in a simplified setting that is consistent with our empirical findings.
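The variance decomposition described in the abstract can be illustrated with the law of total variance: for a predictor trained on a sampled training set S with random initialization I, the total prediction variance splits into an initialization term, E_S[Var_I(h(x) | S)], and a sampling term, Var_S(E_I[h(x) | S]). The sketch below is a minimal illustration of estimating squared bias and these two variance components by retraining a model over several seeds and resampled training sets; it is not the authors' experimental protocol, and the model, task, and sample counts are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the paper's protocol) of
# estimating squared bias, variance due to initialization, and variance due
# to training set sampling via the law of total variance.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def true_fn(x):
    return np.sin(3 * x)

# Fixed test inputs and noiseless targets used to evaluate bias.
x_test = np.linspace(-1, 1, 50).reshape(-1, 1)
y_test = true_fn(x_test).ravel()

n_train_sets, n_seeds, n_train = 5, 5, 100

# preds[s, i, :] = predictions of the model trained on data set s with seed i.
preds = np.empty((n_train_sets, n_seeds, len(x_test)))
for s in range(n_train_sets):
    x_tr = rng.uniform(-1, 1, size=(n_train, 1))
    y_tr = true_fn(x_tr).ravel() + 0.1 * rng.standard_normal(n_train)
    for i in range(n_seeds):
        model = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000,
                             random_state=i)
        model.fit(x_tr, y_tr)
        preds[s, i] = model.predict(x_test)

mean_pred = preds.mean(axis=(0, 1))                   # E_{S,I}[h(x)]
bias_sq = np.mean((mean_pred - y_test) ** 2)          # squared bias
var_init = preds.var(axis=1).mean()                   # E_S[Var_I(h | S)]
var_sampling = preds.mean(axis=1).var(axis=0).mean()  # Var_S(E_I[h | S])

print(f"bias^2                   : {bias_sq:.4f}")
print(f"variance (initialization): {var_init:.4f}")
print(f"variance (sampling)      : {var_sampling:.4f}")
```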
Cite
Text
Neal et al. "A Modern Take on the Bias-Variance Tradeoff in Neural Networks." ICML 2019 Workshops: Deep_Phenomena, 2019.
Markdown
[Neal et al. "A Modern Take on the Bias-Variance Tradeoff in Neural Networks." ICML 2019 Workshops: Deep_Phenomena, 2019.](https://mlanthology.org/icmlw/2019/neal2019icmlw-modern/)
BibTeX
@inproceedings{neal2019icmlw-modern,
  title     = {{A Modern Take on the Bias-Variance Tradeoff in Neural Networks}},
  author    = {Neal, Brady and Mittal, Sarthak and Baratin, Aristide and Tantia, Vinayak and Scicluna, Matthew and Lacoste-Julien, Simon and Mitliagkas, Ioannis},
  booktitle = {ICML 2019 Workshops: Deep_Phenomena},
  year      = {2019},
  url       = {https://mlanthology.org/icmlw/2019/neal2019icmlw-modern/}
}