Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks

Abstract

We provide several new results on the sample complexity of vector-valued linear predictors (parameterized by a matrix), and more generally neural networks. Focusing on size-independent bounds, where only the Frobenius norm distance of the parameters from some fixed reference matrix $W_0$ is controlled, we show that the sample complexity behavior can be surprisingly different from what one might expect based on the well-studied setting of scalar-valued linear predictors. This also leads to new sample complexity bounds for feed-forward neural networks, tackling some open questions in the literature, and establishing a new convex linear prediction problem that is provably learnable without uniform convergence.
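For concreteness, a minimal sketch of the predictor class referenced in the abstract, assuming the standard vector-valued linear setup (the labels $\mathcal{H}_{W_0,B}$, $B$, $k$, and $d$ are illustrative notation, not taken from the paper):

$$\mathcal{H}_{W_0,B} \;=\; \bigl\{\, x \mapsto Wx \;:\; W \in \mathbb{R}^{k \times d},\ \|W - W_0\|_F \le B \,\bigr\},$$

where $\|\cdot\|_F$ is the Frobenius norm. Under this reading, a "size-independent" bound is one that depends on the norm radius $B$ (and related scale quantities) but not on the dimensions $k$ and $d$.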

Cite

Text

Magen and Shamir. "Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks." Neural Information Processing Systems, 2023.

Markdown

[Magen and Shamir. "Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/magen2023neurips-initializationdependent/)

BibTeX

@inproceedings{magen2023neurips-initializationdependent,
  title     = {{Initialization-Dependent Sample Complexity of Linear Predictors and Neural Networks}},
  author    = {Magen, Roey and Shamir, Ohad},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/magen2023neurips-initializationdependent/}
}