Noise Regularizes Over-Parameterized Rank One Matrix Recovery, Provably

Abstract

We investigate the role of noise in optimization algorithms for learning over-parameterized models. Specifically, we consider the recovery of a rank one matrix $Y^*\in R^{d\times d}$ from a noisy observation $Y$ using an over-parameterized model: we parameterize the rank one matrix $Y^*$ by $XX^\top$, where $X\in R^{d\times d}$. We then show that, under mild conditions, the estimator obtained by the randomly perturbed gradient descent algorithm with the square loss function attains a mean square error of $O(\sigma^2/d)$, where $\sigma^2$ is the variance of the observational noise. In contrast, the estimator obtained by gradient descent without random perturbation only attains a mean square error of $O(\sigma^2)$. Our result partially justifies the implicit regularization effect of noise when learning over-parameterized models, and provides new understanding of training over-parameterized neural networks.
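The setup described in the abstract can be sketched in a few lines of NumPy: observe $Y = Y^* + $ noise, parameterize the estimator as $XX^\top$ with a full $d\times d$ matrix $X$, and run gradient descent on the square loss with a small random perturbation added at each step. This is only an illustrative toy, not the paper's algorithm or analysis; all hyperparameters (step size, perturbation scale, iteration count) are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20            # dimension (illustrative)
sigma = 0.1       # std of observational noise
lr = 0.005        # step size (assumption, not from the paper)
noise_scale = 0.005  # scale of the random perturbation (assumption)
steps = 3000

# Ground-truth rank one matrix Y* = x* x*^T
x_star = rng.standard_normal(d)
Y_star = np.outer(x_star, x_star)

# Noisy observation Y = Y* + E, symmetrized so the simple gradient formula applies
E = sigma * rng.standard_normal((d, d))
Y = Y_star + (E + E.T) / 2

# Over-parameterization: X is a full d x d matrix, estimator is X X^T
X = 0.1 * rng.standard_normal((d, d))

for _ in range(steps):
    R = X @ X.T - Y          # residual
    grad = 4 * R @ X         # gradient of ||X X^T - Y||_F^2 for symmetric Y
    # Perturbed gradient descent: add small Gaussian noise to each iterate
    X = X - lr * grad + noise_scale * rng.standard_normal((d, d))

# Mean square error of the estimator against the ground truth
mse = np.mean((X @ X.T - Y_star) ** 2)
```

Dropping the `noise_scale` term recovers plain gradient descent, which is the baseline the abstract contrasts against; the paper's theory concerns the asymptotic scaling in $d$ and $\sigma^2$, which a single small-dimension run like this does not verify.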

Cite

Text

Liu et al. "Noise Regularizes Over-Parameterized Rank One Matrix Recovery, Provably." Artificial Intelligence and Statistics, 2022.

Markdown

[Liu et al. "Noise Regularizes Over-Parameterized Rank One Matrix Recovery, Provably." Artificial Intelligence and Statistics, 2022.](https://mlanthology.org/aistats/2022/liu2022aistats-noise/)

BibTeX

@inproceedings{liu2022aistats-noise,
  title     = {{Noise Regularizes Over-Parameterized Rank One Matrix Recovery, Provably}},
  author    = {Liu, Tianyi and Li, Yan and Zhou, Enlu and Zhao, Tuo},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2022},
  pages     = {2784--2802},
  volume    = {151},
  url       = {https://mlanthology.org/aistats/2022/liu2022aistats-noise/}
}