Statistical Guarantees for Approximate Stationary Points of Shallow Neural Networks
Abstract
Since statistical guarantees for neural networks are usually restricted to global optima of intricate objective functions, it is unclear whether these theories explain the performance of actual outputs of neural network pipelines. The goal of this paper is, therefore, to bring statistical theory closer to practice. We develop statistical guarantees for shallow linear neural networks that coincide up to logarithmic factors with the global optima but apply to stationary points and the points nearby. These results support, from a mathematical perspective, the common notion that neural networks do not necessarily need to be optimized globally. We then extend our statistical guarantees to shallow ReLU neural networks, assuming the first layer weight matrices are nearly identical for the stationary network and the target. More generally, despite being limited to shallow neural networks for now, our theories take an important step forward in describing the practical properties of neural networks in mathematical terms.
Cite
Text
Taheri et al. "Statistical Guarantees for Approximate Stationary Points of Shallow Neural Networks." Transactions on Machine Learning Research, 2025.

Markdown
[Taheri et al. "Statistical Guarantees for Approximate Stationary Points of Shallow Neural Networks." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/taheri2025tmlr-statistical/)

BibTeX
@article{taheri2025tmlr-statistical,
  title   = {{Statistical Guarantees for Approximate Stationary Points of Shallow Neural Networks}},
  author  = {Taheri, Mahsa and Xie, Fang and Lederer, Johannes},
  journal = {Transactions on Machine Learning Research},
  year    = {2025},
  url     = {https://mlanthology.org/tmlr/2025/taheri2025tmlr-statistical/}
}