The Curse of Depth in Kernel Regime

Abstract

Recent work by Jacot et al. (2018) has shown that training a neural network of any kind with gradient descent is strongly related to kernel gradient descent in function space with respect to the Neural Tangent Kernel (NTK). Empirical results in Lee et al. (2019) demonstrated the strong performance of a linearized version of neural network training in the so-called NTK regime. In this paper, we show that the large-depth limit of this regime is unexpectedly trivial, and we fully characterize the convergence rate to this trivial regime.
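To make the kernel-regime setup concrete, below is a minimal sketch (not taken from the paper) of the empirical Neural Tangent Kernel for a small fully-connected network, written in plain JAX. The NTK between two inputs is the inner product of the network's parameter Jacobians, NTK(x1, x2) = ⟨∂f(x1)/∂θ, ∂f(x2)/∂θ⟩; in the NTK regime this kernel stays (approximately) fixed during training, so gradient descent on the network corresponds to kernel gradient descent with this kernel. All widths, depths, and data here are illustrative assumptions.

```python
# Minimal empirical-NTK sketch in JAX (illustrative only; not the paper's code).
import jax
import jax.numpy as jnp

def init_params(key, widths):
    """Gaussian initialisation for an MLP, scaled by 1/sqrt(fan_in)."""
    params = []
    keys = jax.random.split(key, len(widths) - 1)
    for k, (d_in, d_out) in zip(keys, zip(widths[:-1], widths[1:])):
        W = jax.random.normal(k, (d_in, d_out)) / jnp.sqrt(d_in)
        b = jnp.zeros(d_out)
        params.append((W, b))
    return params

def mlp(params, x):
    """Scalar-output MLP with ReLU activations; x has shape (batch, d_in)."""
    h = x
    for W, b in params[:-1]:
        h = jax.nn.relu(h @ W + b)
    W, b = params[-1]
    return (h @ W + b).squeeze(-1)

def empirical_ntk(params, x1, x2):
    """NTK(x1, x2) = <df(x1)/dtheta, df(x2)/dtheta>, summed over all parameters."""
    j1 = jax.jacobian(mlp)(params, x1)  # pytree of Jacobians, leading axis = batch of x1
    j2 = jax.jacobian(mlp)(params, x2)
    contributions = jax.tree_util.tree_map(
        lambda a, b: jnp.tensordot(
            a, b,
            axes=(list(range(1, a.ndim)), list(range(1, b.ndim)))),
        j1, j2)
    return sum(jax.tree_util.tree_leaves(contributions))

key = jax.random.PRNGKey(0)
x_train = jax.random.normal(key, (8, 4))        # 8 toy points in R^4
params = init_params(key, widths=[4, 64, 64, 1])
K = empirical_ntk(params, x_train, x_train)     # (8, 8) NTK Gram matrix
print(K.shape)
```

Under the assumptions above, the (8, 8) Gram matrix K is what kernel gradient descent (and the linearized training of Lee et al., 2019) operates on; the paper's result concerns how this kernel degenerates as depth grows.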

Cite

Text

Hayou et al. "The Curse of Depth in Kernel Regime." NeurIPS 2021 Workshops: ICBINB, 2021.

Markdown

[Hayou et al. "The Curse of Depth in Kernel Regime." NeurIPS 2021 Workshops: ICBINB, 2021.](https://mlanthology.org/neuripsw/2021/hayou2021neuripsw-curse/)

BibTeX

@inproceedings{hayou2021neuripsw-curse,
  title     = {{The Curse of Depth in Kernel Regime}},
  author    = {Hayou, Soufiane and Doucet, Arnaud and Rousseau, Judith},
  booktitle = {NeurIPS 2021 Workshops: ICBINB},
  year      = {2021},
  url       = {https://mlanthology.org/neuripsw/2021/hayou2021neuripsw-curse/}
}