Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape

Abstract

We study the loss landscape of both shallow and deep, mildly overparameterized ReLU neural networks on a generic finite input dataset for the squared error loss. We show, both by count and by volume, that most activation patterns correspond to parameter regions with no bad local minima. Furthermore, for one-dimensional input data, we show that most activation regions realizable by the network contain a high-dimensional set of global minima and no bad local minima. We confirm these results experimentally by finding a phase transition, depending on the amount of overparameterization, from most regions having a full-rank Jacobian to many regions having a rank-deficient Jacobian.
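The phase-transition experiment described above can be illustrated with a small numerical sketch: sample random parameters of a shallow ReLU network, and check whether the Jacobian of the network outputs with respect to the parameters has rank equal to the number of data points (within an activation region the network is smooth, and a full-rank Jacobian rules out bad local minima there). Everything below (the network sizes, the sampling scheme, the `relu_net` and `jacobian_rank` helpers) is a hypothetical illustration, not the authors' code.

```python
import numpy as np

def relu_net(params, X, m):
    """Shallow ReLU network f(x) = v^T relu(W x); params = (W flattened, v)."""
    d = X.shape[1]
    W = params[: m * d].reshape(m, d)
    v = params[m * d :]
    return np.maximum(W @ X.T, 0.0).T @ v  # outputs on all n points

def jacobian_rank(params, X, m, eps=1e-6):
    """Rank of the forward-difference Jacobian of outputs w.r.t. parameters."""
    n, p = X.shape[0], params.size
    J = np.zeros((n, p))
    f0 = relu_net(params, X, m)
    for j in range(p):
        pert = params.copy()
        pert[j] += eps
        J[:, j] = (relu_net(pert, X, m) - f0) / eps
    return np.linalg.matrix_rank(J)

rng = np.random.default_rng(0)
n, d = 20, 2          # data points, input dimension
X = rng.normal(size=(n, d))
for m in (2, 5, 10):  # hidden widths from under- to mildly overparameterized
    full = sum(
        jacobian_rank(rng.normal(size=m * d + m), X, m) == n
        for _ in range(100)
    )
    print(f"width {m}: {full}/100 sampled regions have full-rank Jacobian")
```

At width 2 the parameter count (6) is below the number of data points, so the Jacobian can never reach rank 20; at width 10 the parameter count (30) exceeds it, and most sampled regions attain full rank, which is the qualitative transition the paper measures at scale.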

Cite

Text

Karhadkar et al. "Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape." Transactions on Machine Learning Research, 2024.

Markdown

[Karhadkar et al. "Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/karhadkar2024tmlr-mildly/)

BibTeX

@article{karhadkar2024tmlr-mildly,
  title     = {{Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape}},
  author    = {Karhadkar, Kedar and Murray, Michael and Tseran, Hanna and Montufar, Guido},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/karhadkar2024tmlr-mildly/}
}