Near-Interpolators: Rapid Norm Growth and the Trade-Off Between Interpolation and Generalization
Abstract
We study the generalization capability of nearly-interpolating linear regressors: ${\beta}$’s whose training error $\tau$ is positive but small, i.e., below the noise floor. Under a random matrix theoretic assumption on the data distribution and an eigendecay assumption on the data covariance matrix ${\Sigma}$, we demonstrate that any near-interpolator exhibits rapid norm growth: for $\tau$ fixed, ${\beta}$ has squared $\ell_2$-norm $\mathbb{E}[\|{{\beta}}\|_{2}^{2}] = \Omega(n^{\alpha})$ where $n$ is the number of samples and $\alpha >1$ is the exponent of the eigendecay, i.e., $\lambda_i({\Sigma}) \sim i^{-\alpha}$. This implies that existing data-independent norm-based bounds are necessarily loose. On the other hand, in the same regime we precisely characterize the asymptotic trade-off between interpolation and generalization. Our characterization reveals that larger norm scaling exponents $\alpha$ correspond to worse trade-offs between interpolation and generalization. We verify empirically that a similar phenomenon holds for nearly-interpolating shallow neural networks.
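As a quick empirical sanity check of the norm-growth claim, the sketch below (not the authors' code; the Gaussian data model with diagonal power-law covariance, the ridge-tuning heuristic for hitting a target training error, and all parameter values are illustrative assumptions) fits near-interpolators at a fixed training error $\tau$ and tracks how the squared $\ell_2$-norm of $\beta$ grows with $n$ when $\lambda_i(\Sigma) \sim i^{-\alpha}$.

```python
# Minimal simulation sketch of near-interpolator norm growth (illustrative only).
import numpy as np

def near_interpolator(X, y, tau, lams=np.logspace(-6, 3, 80)):
    """Hypothetical helper: ridge solution whose training MSE is closest to tau."""
    n = X.shape[0]
    best_beta, best_gap = None, np.inf
    for lam in lams:
        # Dual form of ridge regression: beta = X^T (X X^T + lam I_n)^{-1} y
        beta = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)
        gap = abs(np.mean((X @ beta - y) ** 2) - tau)
        if gap < best_gap:
            best_beta, best_gap = beta, gap
    return best_beta

rng = np.random.default_rng(0)
alpha, tau, d = 1.5, 0.5, 2000                           # eigendecay exponent, target train error, ambient dim
eigs = np.arange(1, d + 1, dtype=float) ** (-alpha)      # lambda_i(Sigma) ~ i^{-alpha}
beta_star = rng.normal(size=d) / np.sqrt(d)              # ground-truth signal (arbitrary illustrative choice)

for n in [100, 200, 400, 800]:
    X = rng.normal(size=(n, d)) * np.sqrt(eigs)          # rows ~ N(0, Sigma), Sigma diagonal
    y = X @ beta_star + rng.normal(scale=1.0, size=n)    # noise variance 1, so tau = 0.5 is below the noise floor
    beta = near_interpolator(X, y, tau)
    print(n, np.sum(beta ** 2))                          # squared l2-norm: expect roughly n^alpha growth
```

Under these assumptions, holding $\tau$ fixed while increasing $n$ forces the tuned regressor to fit more of the noise, and the printed squared norms should increase at a rate consistent with the $\Omega(n^{\alpha})$ scaling stated in the abstract.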
Cite
Text
Wang et al. "Near-Interpolators: Rapid Norm Growth and the Trade-Off Between Interpolation and Generalization." Artificial Intelligence and Statistics, 2024.
Markdown
[Wang et al. "Near-Interpolators: Rapid Norm Growth and the Trade-Off Between Interpolation and Generalization." Artificial Intelligence and Statistics, 2024.](https://mlanthology.org/aistats/2024/wang2024aistats-nearinterpolators/)
BibTeX
@inproceedings{wang2024aistats-nearinterpolators,
title = {{Near-Interpolators: Rapid Norm Growth and the Trade-Off Between Interpolation and Generalization}},
author = {Wang, Yutong and Sonthalia, Rishi and Hu, Wei},
booktitle = {Artificial Intelligence and Statistics},
year = {2024},
pages = {4483--4491},
volume = {238},
url = {https://mlanthology.org/aistats/2024/wang2024aistats-nearinterpolators/}
}