Adaptive Proximal Gradient Method for Convex Optimization

Abstract

In this paper, we explore two fundamental first-order algorithms in convex optimization: gradient descent (GD) and the proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions. We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and thus incur no additional computational cost. Moreover, we prove convergence of our methods assuming only local Lipschitz continuity of the gradient. In addition, the proposed versions allow for even larger stepsizes than those initially suggested in [MM20].
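To make the idea of a stepsize "based on observed gradient differences" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm proposed in this paper: it uses the stepsize rule of the earlier work [MM20], where the step is the smaller of a growth bound and half the inverse of the local curvature estimate ||x_k - x_{k-1}|| / ||∇f(x_k) - ∇f(x_{k-1})||. The function name `adaptive_gd`, its parameters, the stopping tolerance, and the test problem are all hypothetical choices made for this example.

```python
import math


def adaptive_gd(grad, x0, step0=1e-6, iters=2000, tol=1e-10):
    """Gradient descent with a stepsize estimated from gradient differences.

    Assumption: this follows the [MM20] rule, not the larger stepsizes
    proposed in the paper above. `grad` maps a list of floats to a list.
    """
    norm = lambda v: math.sqrt(sum(c * c for c in v))

    x_prev = list(x0)
    g_prev = grad(x_prev)
    # One tiny initial step provides the first pair of iterates.
    x = [xi - step0 * gi for xi, gi in zip(x_prev, g_prev)]
    step_prev, theta = step0, float("inf")

    for _ in range(iters):
        g = grad(x)
        if norm(g) <= tol:  # hypothetical stopping test, added for the demo
            break
        dx = norm([a - b for a, b in zip(x, x_prev)])
        dg = norm([a - b for a, b in zip(g, g_prev)])

        # Growth bound: the step may not increase too quickly.
        growth = math.sqrt(1.0 + theta) * step_prev
        # Local curvature bound: dg / dx estimates the local Lipschitz constant.
        local = dx / (2.0 * dg) if dg > 0.0 else float("inf")
        step = min(growth, local)
        if math.isinf(step):  # no curvature information yet: keep the old step
            step = step_prev

        theta = step / step_prev
        x_prev, g_prev = x, g
        x = [xi - step * gi for xi, gi in zip(x, g)]
        step_prev = step
    return x


# Usage on a small quadratic f(x) = 0.5 * (x1^2 + 10 * x2^2) - x1 - x2,
# whose minimizer is (1, 0.1). No stepsize or Lipschitz constant is supplied.
quad_grad = lambda x: [x[0] - 1.0, 10.0 * x[1] - 1.0]
solution = adaptive_gd(quad_grad, [0.0, 0.0])
```

The only quantities the rule needs are the last two iterates and their gradients, which plain gradient descent computes anyway. This is why the adaptivity adds no extra gradient evaluations or line-search loops.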

Cite

Text

Malitsky and Mishchenko. "Adaptive Proximal Gradient Method for Convex Optimization." Neural Information Processing Systems, 2024. doi:10.52202/079017-3193

Markdown

[Malitsky and Mishchenko. "Adaptive Proximal Gradient Method for Convex Optimization." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/malitsky2024neurips-adaptive/) doi:10.52202/079017-3193

BibTeX

@inproceedings{malitsky2024neurips-adaptive,
  title     = {{Adaptive Proximal Gradient Method for Convex Optimization}},
  author    = {Malitsky, Yura and Mishchenko, Konstantin},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-3193},
  url       = {https://mlanthology.org/neurips/2024/malitsky2024neurips-adaptive/}
}