From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression
Abstract
We conduct a comprehensive investigation into the dynamics of gradient descent with large constant step-sizes in the context of quadratic regression models. Within this framework, we show that the dynamics can be encapsulated by a specific cubic map, naturally parameterized by the step-size. Through a fine-grained bifurcation analysis with respect to the step-size parameter, we delineate five distinct training phases: (1) monotonic, (2) catapult, (3) periodic, (4) chaotic, and (5) divergent, and we precisely demarcate the boundaries of each phase. As illustrations, we provide examples involving phase retrieval and two-layer neural networks with quadratic activation functions and constant outer layers, trained on orthogonal data. Our simulations indicate that these five phases also manifest with generic non-orthogonal data. We also empirically investigate the generalization performance when training in the various non-monotonic (and non-divergent) phases. In particular, we observe that performing ergodic trajectory averaging stabilizes the test error in these phases.
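The cubic-map claim is easy to see in the simplest instance. For one-dimensional phase retrieval with target w* = 1, the loss f(w) = (w^2 - 1)^2 / 4 has gradient w(w^2 - 1), so a gradient-descent step w <- w - eta * w * (w^2 - 1) is exactly a cubic map in w, parameterized by the step-size eta. The sketch below is not the authors' code; the initial point, iteration counts, and step-size values are illustrative assumptions. It scans a few step-sizes and reports the limiting behavior, which passes through convergent, periodic, chaotic, and divergent regimes like those described above.

```python
import numpy as np

def gd_step(w, eta):
    """One gradient-descent step on the scalar phase-retrieval loss
    f(w) = (w^2 - 1)^2 / 4, whose gradient is w * (w^2 - 1).
    The update is a cubic map in w, parameterized by eta."""
    return w - eta * w * (w**2 - 1.0)

def limit_behavior(eta, w0=0.3, burn_in=500, keep=64, blowup=1e6):
    """Iterate the map, discard transients, and return the visited points.

    One returned point suggests convergence, a short cycle the periodic
    phase, a filled-out set chaos; None marks divergence. These are
    heuristic labels, not the paper's exact phase boundaries."""
    w = w0
    for _ in range(burn_in):
        w = gd_step(w, eta)
        if abs(w) > blowup:
            return None  # divergent phase
    pts = np.empty(keep)
    for i in range(keep):
        w = gd_step(w, eta)
        pts[i] = w
    return pts

for eta in [0.5, 1.2, 1.8, 2.5]:  # illustrative step-sizes only
    pts = limit_behavior(eta)
    if pts is None:
        print(f"eta={eta}: divergent")
    else:
        n = len(np.unique(np.round(pts, 6)))
        # Averaging along the trajectory mimics the ergodic trajectory
        # averaging that the paper applies (to the test error) in the
        # non-monotonic, non-divergent phases.
        print(f"eta={eta}: {n} distinct limit point(s), "
              f"trajectory average {pts.mean():+.3f}")
```

Running this prints one limit point for a small step-size, a two-point cycle once the fixed point loses stability, many distinct points in the chaotic regime, and divergence for a sufficiently large step-size.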
Cite

Chen et al. "From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression." Transactions on Machine Learning Research, 2024.
BibTeX:
@article{chen2024tmlr-stability,
title = {{From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression}},
author = {Chen, Xuxing and Balasubramanian, Krishna and Ghosal, Promit and Agrawalla, Bhavya Kumar},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/chen2024tmlr-stability/}
}