A New Regret Analysis for Adam-Type Algorithms

Abstract

In this paper, we focus on a theory-practice gap for Adam and its variants (AMSGrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter $\beta_{1}$ (typically between $0.9$ and $0.99$). In theory, regret guarantees for online convex optimization require a rapidly decaying $\beta_{1}\to0$ schedule. We show that this is an artifact of the standard analysis, and we propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant $\beta_{1}$, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of algorithms and settings.
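For context, below is a minimal NumPy sketch of one AMSGrad iteration with a constant $\beta_{1}$, which is the regime the paper analyzes; the function name amsgrad_step, the step-size schedule $\alpha_t = \alpha/\sqrt{t}$, and the default hyperparameter values are illustrative assumptions, not details taken from the paper.

import numpy as np

def amsgrad_step(x, m, v, v_hat, grad, t,
                 alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # Hypothetical sketch of one AMSGrad step with a *constant* beta1.
    m = beta1 * m + (1 - beta1) * grad        # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    v_hat = np.maximum(v_hat, v)              # running max keeps effective step sizes non-increasing
    x = x - (alpha / np.sqrt(t)) * m / (np.sqrt(v_hat) + eps)  # assumed schedule alpha_t = alpha / sqrt(t)
    return x, m, v, v_hat

Prior regret analyses replace the constant beta1 above with a decaying $\beta_{1,t}\to0$ schedule; the paper's contribution is data-dependent regret bounds that hold with the constant $\beta_{1}$ shown here.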

Cite

Text

Alacaoglu et al. "A New Regret Analysis for Adam-Type Algorithms." International Conference on Machine Learning, 2020.

Markdown

[Alacaoglu et al. "A New Regret Analysis for Adam-Type Algorithms." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/alacaoglu2020icml-new/)

BibTeX

@inproceedings{alacaoglu2020icml-new,
  title     = {{A New Regret Analysis for Adam-Type Algorithms}},
  author    = {Alacaoglu, Ahmet and Malitsky, Yura and Mertikopoulos, Panayotis and Cevher, Volkan},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {202--210},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/alacaoglu2020icml-new/}
}