Near-Interpolators: Fast Norm Growth and Tempered Near-Overfitting

Abstract

We study linear regression when the input data population covariance matrix has eigenvalues $\lambda_i \sim i^{-\alpha}$ where $\alpha > 1$. Under a generic random matrix theory assumption, we prove that any near-interpolator, i.e., any ${\beta}$ whose training error is below the noise floor, must have its squared $\ell_2$-norm growing super-linearly with the number of samples $n$: $\|{\beta}\|_{2}^{2} = \Omega(n^{\alpha})$. This implies that existing norm-based generalization bounds grow as the number of samples increases, matching empirical observations from prior work. On the other hand, properly tuned near-interpolators generalize well: their test error can approach the noise floor arbitrarily closely. Our work demonstrates that existing norm-based generalization bounds are vacuous for explaining the generalization capability of \emph{any} near-interpolator. Moreover, we show that the trade-off between train and test accuracy is better when the norm-growth exponent is smaller.
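The following is a minimal simulation sketch (not code from the paper) of the setting the abstract describes: Gaussian data with power-law covariance eigenvalues $\lambda_i \sim i^{-\alpha}$, a ridge estimator tuned so its training error sits below the noise floor (a "near-interpolator"), and the growth of its squared $\ell_2$-norm with $n$. All concrete choices (the values of `alpha`, `sigma`, `lam`, `d`, and the random ground-truth vector) are illustrative assumptions, not parameters taken from the paper.

```python
# Illustrative sketch only: near-interpolation under a power-law covariance
# spectrum, tracking how ||beta||^2 grows with the sample size n.
import numpy as np

rng = np.random.default_rng(0)

def near_interpolator_norm(n, d, alpha=1.5, sigma=1.0, lam=1e-3):
    """Fit a lightly regularized ridge estimator (assumed here to land below
    the noise floor sigma^2) and return its squared l2-norm and train MSE."""
    # Population covariance with eigenvalues lambda_i ~ i^{-alpha}.
    eigs = np.arange(1, d + 1) ** (-alpha)
    X = rng.standard_normal((n, d)) * np.sqrt(eigs)      # rows ~ N(0, diag(eigs))
    beta_star = rng.standard_normal(d) / np.sqrt(d)      # arbitrary ground truth
    y = X @ beta_star + sigma * rng.standard_normal(n)   # noisy labels

    # Ridge solution via the dual (n x n) form: beta = X^T (X X^T + lam I)^{-1} y.
    # With d >> n and small lam, the fit nearly interpolates, so the training
    # error falls below the noise floor sigma^2.
    beta_hat = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    train_mse = np.mean((X @ beta_hat - y) ** 2)
    return np.sum(beta_hat ** 2), train_mse

# Doubling n should roughly multiply ||beta||^2 by 2^alpha if the Omega(n^alpha)
# growth is visible at these (small, illustrative) scales.
for n in [100, 200, 400, 800]:
    norm_sq, train_mse = near_interpolator_norm(n, d=4000)
    print(f"n={n:4d}  train MSE={train_mse:.3f}  ||beta||^2={norm_sq:.1f}")
```

A quick sanity check on the printed output is to compare successive ratios of $\|\beta\|_2^2$ against $2^{\alpha}$; finite-size effects mean the agreement is only approximate at these scales.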

Cite

Text

Wang et al. "Near-Interpolators: Fast Norm Growth and Tempered Near-Overfitting." NeurIPS 2023 Workshops: M3L, 2023.

Markdown

[Wang et al. "Near-Interpolators: Fast Norm Growth and Tempered Near-Overfitting." NeurIPS 2023 Workshops: M3L, 2023.](https://mlanthology.org/neuripsw/2023/wang2023neuripsw-nearinterpolators/)

BibTeX

@inproceedings{wang2023neuripsw-nearinterpolators,
  title     = {{Near-Interpolators: Fast Norm Growth and Tempered Near-Overfitting}},
  author    = {Wang, Yutong and Sonthalia, Rishi and Hu, Wei},
  booktitle = {NeurIPS 2023 Workshops: M3L},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/wang2023neuripsw-nearinterpolators/}
}