Anytime Acceleration of Gradient Descent
Abstract
This work investigates stepsize-based acceleration of gradient descent with anytime convergence guarantees. For smooth (non-strongly) convex optimization, we propose a stepsize schedule that allows gradient descent to achieve convergence guarantees of $O\big(T^{-\frac{2\log_2\rho}{1+\log_2\rho}}\big) \approx O(T^{-1.119})$ for any stopping time $T$, where $\rho=\sqrt{2}+1$ is the silver ratio and the stepsize schedule is predetermined without prior knowledge of the stopping time. This result provides an affirmative answer to a COLT open problem regarding whether stepsize-based acceleration can yield anytime convergence rates of $o(T^{-1})$. We further extend our theory to yield anytime convergence guarantees of $\exp(-\Omega(T/\kappa^{0.893}))$ for smooth and strongly convex optimization, with $\kappa$ being the condition number.
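As a quick numerical sanity check on the rate stated in the abstract (this snippet is illustrative and not part of the paper), the exponent $\frac{2\log_2\rho}{1+\log_2\rho}$ with the silver ratio $\rho=\sqrt{2}+1$ can be evaluated directly:

```python
import math

# Silver ratio rho = sqrt(2) + 1
rho = math.sqrt(2) + 1
log2_rho = math.log2(rho)

# Anytime convergence exponent from the abstract:
# O(T^{-2 log2(rho) / (1 + log2(rho))})
exponent = 2 * log2_rho / (1 + log2_rho)
print(exponent)  # approximately 1.119, i.e. strictly faster than the O(1/T) baseline
```

The value exceeds 1, which is what makes the guarantee a genuine $o(T^{-1})$ anytime acceleration.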
Cite
Text
Zhang et al. "Anytime Acceleration of Gradient Descent." Proceedings of Thirty Eighth Conference on Learning Theory, 2025.
Markdown
[Zhang et al. "Anytime Acceleration of Gradient Descent." Proceedings of Thirty Eighth Conference on Learning Theory, 2025.](https://mlanthology.org/colt/2025/zhang2025colt-anytime/)
BibTeX
@inproceedings{zhang2025colt-anytime,
title = {{Anytime Acceleration of Gradient Descent}},
author = {Zhang, Zihan and Lee, Jason and Du, Simon and Chen, Yuxin},
booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
year = {2025},
pages = {5991--6013},
volume = {291},
url = {https://mlanthology.org/colt/2025/zhang2025colt-anytime/}
}