Optimistic Adaptive Acceleration for Optimization

Abstract

This paper introduces Optimistic-AMSGrad, a new variant of AMSGrad. AMSGrad is a popular adaptive gradient-based optimization algorithm widely used in training deep neural networks. The new variant assumes that mini-batch gradients in consecutive iterations share some underlying structure that makes them sequentially predictable. By exploiting this predictability together with ideas from optimistic online learning, the proposed algorithm accelerates convergence and enjoys a tighter regret bound. We evaluate Optimistic-AMSGrad against AMSGrad on several performance measures (training loss, testing loss, and classification accuracy on training/testing data), and the results demonstrate that Optimistic-AMSGrad improves upon AMSGrad.
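To make the idea concrete, the following is a minimal sketch of an optimistic, AMSGrad-style update loop. It is not the paper's exact algorithm: the function name `optimistic_amsgrad_sketch`, the hyperparameter defaults, and in particular the gradient predictor (here simply the most recent gradient, applied as an extra-gradient-style half step) are illustrative assumptions; the paper uses a more sophisticated predictor of the next gradient.

```python
import numpy as np

def optimistic_amsgrad_sketch(grad, w0, steps=200, lr=0.1,
                              beta1=0.9, beta2=0.999, eps=1e-8):
    """Sketch of an optimistic AMSGrad-style loop (illustrative only).

    Assumption: the prediction ``h`` of the next gradient is the
    current gradient; the paper's method learns a better predictor.
    """
    x = np.asarray(w0, dtype=float).copy()  # "lazy" base iterate
    z = x.copy()                            # point where gradients are queried
    m = np.zeros_like(x)                    # first-moment estimate
    v = np.zeros_like(x)                    # second-moment estimate
    v_hat = np.zeros_like(x)                # AMSGrad running max of v
    for _ in range(steps):
        g = grad(z)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        v_hat = np.maximum(v_hat, v)        # the AMSGrad "max" correction
        denom = np.sqrt(v_hat) + eps
        x = x - lr * m / denom              # standard adaptive step
        h = g                               # assumed predictor: last gradient
        z = x - lr * h / denom              # optimistic step using prediction
    return z
```

When the predictor is accurate (gradients change slowly between iterations), the optimistic step moves the iterate toward where the next gradient will point before that gradient is observed, which is the source of the acceleration and the tighter regret bound described in the abstract.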

Cite

Text

Wang et al. "Optimistic Adaptive Acceleration for Optimization." International Conference on Learning Representations, 2020.

Markdown

[Wang et al. "Optimistic Adaptive Acceleration for Optimization." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/wang2020iclr-optimistic/)

BibTeX

@inproceedings{wang2020iclr-optimistic,
  title     = {{Optimistic Adaptive Acceleration for Optimization}},
  author    = {Wang, Jun-Kun and Li, Xiaoyun and Li, Ping},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/wang2020iclr-optimistic/}
}