SAdam: A Variant of Adam for Strongly Convex Functions

Abstract

The Adam algorithm has become extremely popular for large-scale machine learning. Under the convexity condition, it has been proved to enjoy a data-dependent $O(\sqrt{T})$ regret bound, where $T$ is the time horizon. However, whether strong convexity can be utilized to further improve the performance remains an open problem. In this paper, we give an affirmative answer by developing a variant of Adam (referred to as SAdam) which achieves a data-dependent $O(\log T)$ regret bound for strongly convex functions. The essential idea is to maintain a faster-decaying yet controlled step size for exploiting strong convexity. In addition, under a special configuration of hyperparameters, our SAdam reduces to SC-RMSprop, a recently proposed variant of RMSprop for strongly convex functions, for which we provide the first data-dependent logarithmic regret bound. Empirical results on optimizing strongly convex functions and training deep networks demonstrate the effectiveness of our method.
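The "faster-decaying yet controlled step size" idea from the abstract can be sketched as an Adam-style update whose effective step size shrinks as $O(1/t)$ instead of Adam's $O(1/\sqrt{t})$. This is a minimal illustrative sketch, not the paper's exact algorithm: the hyperparameter values (`alpha`, `beta1`, `beta2`, `delta`) and the particular second-moment correction `v + delta / t` are assumptions made for the example.

```python
# Hedged sketch of the SAdam idea: Adam-style moments with an O(1/t)
# step size, kept stable by an additive delta/t term in the denominator.
# All hyperparameter choices below are illustrative assumptions.
def sadam_sketch(grad, x0, steps, alpha=0.5, beta1=0.9, beta2=0.999, delta=1e-2):
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g      # first moment (momentum)
        v = beta2 * v + (1 - beta2) * g * g  # second moment
        v_hat = v + delta / t                # keeps the step size under control
        x -= (alpha / t) * m / v_hat         # O(1/t) decaying step
    return x

# Minimize the 1-D strongly convex function f(x) = (x - 3)^2.
x_star = sadam_sketch(lambda x: 2 * (x - 3), x0=0.0, steps=5000)
```

The example is one-dimensional for readability; an adaptive method like this would apply the update per coordinate in practice.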

Cite

Text

Wang et al. "SAdam: A Variant of Adam for Strongly Convex Functions." International Conference on Learning Representations, 2020.

Markdown

[Wang et al. "SAdam: A Variant of Adam for Strongly Convex Functions." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/wang2020iclr-sadam/)

BibTeX

@inproceedings{wang2020iclr-sadam,
  title     = {{SAdam: A Variant of Adam for Strongly Convex Functions}},
  author    = {Wang, Guanghui and Lu, Shiyin and Tu, Weiwei and Zhang, Lijun},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/wang2020iclr-sadam/}
}