SAdam: A Variant of Adam for Strongly Convex Functions

Abstract

The Adam algorithm has become extremely popular for large-scale machine learning. Under the convexity condition, it has been proved to enjoy a data-dependent $O(\sqrt{T})$ regret bound, where $T$ is the time horizon. However, whether strong convexity can be utilized to further improve the performance remains an open problem. In this paper, we give an affirmative answer by developing a variant of Adam (referred to as SAdam) which achieves a data-dependent $O(\log T)$ regret bound for strongly convex functions. The essential idea is to maintain a faster-decaying yet controlled step size for exploiting strong convexity. In addition, under a special configuration of hyperparameters, our SAdam reduces to SC-RMSprop, a recently proposed variant of RMSprop for strongly convex functions, for which we provide the first data-dependent logarithmic regret bound. Empirical results on optimizing strongly convex functions and training deep networks demonstrate the effectiveness of our method.
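The "faster-decaying yet controlled step size" idea from the abstract can be sketched as an Adam-style update whose effective step size shrinks as $O(1/t)$ instead of Adam's $O(1/\sqrt{t})$. This is a minimal illustrative sketch, not the paper's exact algorithm: the hyperparameter values (`alpha`, `beta1`, `beta2`, `delta`) and the particular second-moment correction `v + delta / t` are assumptions made for the example.

```python
# Hedged sketch of the SAdam idea: Adam-style moments with an O(1/t)
# step size, kept stable by an additive delta/t term in the denominator.
# All hyperparameter choices below are illustrative assumptions.
def sadam_sketch(grad, x0, steps, alpha=0.5, beta1=0.9, beta2=0.999, delta=1e-2):
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g      # first moment (momentum)
        v = beta2 * v + (1 - beta2) * g * g  # second moment
        v_hat = v + delta / t                # keeps the step size under control
        x -= (alpha / t) * m / v_hat         # O(1/t) decaying step
    return x

# Minimize the 1-D strongly convex function f(x) = (x - 3)^2.
x_star = sadam_sketch(lambda x: 2 * (x - 3), x0=0.0, steps=5000)
```

The example is one-dimensional for readability; an adaptive method like this would apply the update per coordinate in practice.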

Cite

Text

Wang et al. "SAdam: A Variant of Adam for Strongly Convex Functions." International Conference on Learning Representations, 2020.

Markdown

[Wang et al. "SAdam: A Variant of Adam for Strongly Convex Functions." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/wang2020iclr-sadam/)

BibTeX

@inproceedings{wang2020iclr-sadam,
  title     = {{SAdam: A Variant of Adam for Strongly Convex Functions}},
  author    = {Wang, Guanghui and Lu, Shiyin and Tu, Weiwei and Zhang, Lijun},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/wang2020iclr-sadam/}
}