Understanding Incremental Learning of Gradient Descent: A Fine-Grained Analysis of Matrix Sensing

Abstract

It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in training machine learning models. This paper provides a fine-grained analysis of the dynamics of GD for the matrix sensing problem, whose goal is to recover a low-rank ground-truth matrix from near-isotropic linear measurements. It is shown that GD with small initialization behaves similarly to the greedy low-rank learning heuristics and follows an incremental learning procedure: GD sequentially learns solutions with increasing ranks until it recovers the ground-truth matrix. Compared with existing works, which analyze only the first learning phase for rank-1 solutions, our result characterizes the whole learning process. Moreover, besides the over-parameterized regime that many prior works have focused on, our analysis of the incremental learning procedure also applies to the under-parameterized regime. Finally, we conduct numerical experiments to confirm our theoretical findings.
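
The incremental learning behaviour described in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' experimental setup; it assumes a symmetric formulation in which the ground truth is a positive semidefinite matrix, the iterate is parameterized as U Uᵀ, the loss is the mean squared measurement error scaled by 1/4, and the measurement matrices are symmetrized Gaussian matrices. The function name `run_matrix_sensing_gd` and all parameter values (dimension, number of measurements, initialization scale, step size, eigenvalues 1.0 and 0.4) are hypothetical choices made for illustration only.

```python
import numpy as np


def run_matrix_sensing_gd(d=10, m=2000, alpha=1e-4, lr=0.1, steps=1500, seed=0):
    """Illustrative sketch: GD on f(U) = 1/(4m) * sum_i (<A_i, U U^T> - y_i)^2.

    All defaults are assumptions chosen for this example, not values from the paper.
    Returns the top-3 eigenvalues of U U^T at every step and the final relative error.
    """
    rng = np.random.default_rng(seed)

    # Rank-2 PSD ground truth with well-separated eigenvalues 1.0 and 0.4.
    Q, _ = np.linalg.qr(rng.standard_normal((d, 2)))
    Z_star = Q @ np.diag([1.0, 0.4]) @ Q.T

    # Symmetrized Gaussian measurement matrices, one flattened matrix per row.
    G = rng.standard_normal((m, d, d))
    A = ((G + G.transpose(0, 2, 1)) / 2).reshape(m, d * d)
    y = A @ Z_star.ravel()  # noiseless linear measurements <A_i, Z_star>

    # Small random initialization; U is d x d (full-width parameterization).
    U = alpha * rng.standard_normal((d, d))

    history = []
    for _ in range(steps):
        resid = A @ (U @ U.T).ravel() - y        # <A_i, U U^T> - y_i
        M = (A.T @ resid).reshape(d, d) / m      # (1/m) * sum_i resid_i * A_i
        U = U - lr * (M @ U)                     # gradient step (A_i symmetric)
        eigs = np.linalg.eigvalsh(U @ U.T)[::-1]  # eigenvalues, descending
        history.append(eigs[:3])

    rel_err = np.linalg.norm(U @ U.T - Z_star) / np.linalg.norm(Z_star)
    return np.array(history), rel_err
```

Calling `history, rel_err = run_matrix_sensing_gd()` and plotting the columns of `history` against the step index should show the leading eigenvalue of U Uᵀ rising first while the others stay near zero, followed later by the second eigenvalue. Under these assumed settings, that is the rank-by-rank pattern the abstract refers to.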

Cite

Text

Jin et al. "Understanding Incremental Learning of Gradient Descent: A Fine-Grained Analysis of Matrix Sensing." International Conference on Machine Learning, 2023.

Markdown

[Jin et al. "Understanding Incremental Learning of Gradient Descent: A Fine-Grained Analysis of Matrix Sensing." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/jin2023icml-understanding/)

BibTeX

@inproceedings{jin2023icml-understanding,
  title     = {{Understanding Incremental Learning of Gradient Descent: A Fine-Grained Analysis of Matrix Sensing}},
  author    = {Jin, Jikai and Li, Zhiyuan and Lyu, Kaifeng and Du, Simon Shaolei and Lee, Jason D.},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {15200--15238},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/jin2023icml-understanding/}
}