Depth Separation with Multilayer Mean-Field Networks

Abstract

Depth separation, the question of why a deeper network is more powerful than a shallow one, is a longstanding problem in deep learning theory. Previous results often focus on representation power; for example, Safran et al. (2019) constructed a function that is easy to approximate using a 3-layer network but not approximable by any 2-layer network. In this paper, we show that this separation is in fact algorithmic: the function constructed by Safran et al. (2019) can be learned efficiently by an overparametrized network with polynomially many neurons. Our result relies on a new way of extending the mean-field limit to multilayer networks, and on a decomposition of the loss that factors out the error introduced by discretizing the infinite-width mean-field network.

Cite

Text

Ren, Zhou, and Ge. "Depth Separation with Multilayer Mean-Field Networks." International Conference on Learning Representations, 2023.

Markdown

[Ren et al. "Depth Separation with Multilayer Mean-Field Networks." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/ren2023iclr-depth/)

BibTeX

@inproceedings{ren2023iclr-depth,
  title     = {{Depth Separation with Multilayer Mean-Field Networks}},
  author    = {Ren, Yunwei and Zhou, Mo and Ge, Rong},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/ren2023iclr-depth/}
}