ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization
Abstract
Stochastic gradient descent (SGD) is a widely used method owing to its outstanding generalization ability and simplicity. Adaptive gradient methods have been proposed to further accelerate the optimization process. In this paper, we revisit existing adaptive gradient optimization methods with a new interpretation. This new perspective leads to a refreshed understanding of the role of second moments in stochastic optimization. Based on this, we propose the Angle-Calibrated Moment method (ACMo), a novel stochastic optimization method that enjoys the benefits of second moments while using only first moment updates. Theoretical analysis shows that ACMo achieves the same convergence rate as mainstream adaptive methods. Experiments on a variety of CV and NLP tasks demonstrate that ACMo converges comparably to state-of-the-art Adam-type optimizers and, in most cases, generalizes better. The code is available at https://github.com/Xunpeng746/ACMo.
Cite
Text
Huang et al. "ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I9.16959
Markdown
[Huang et al. "ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/huang2021aaai-acmo/) doi:10.1609/AAAI.V35I9.16959
BibTeX
@inproceedings{huang2021aaai-acmo,
title = {{ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization}},
author = {Huang, Xunpeng and Xu, Runxin and Zhou, Hao and Wang, Zhe and Liu, Zhengyang and Li, Lei},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {7857-7864},
doi = {10.1609/AAAI.V35I9.16959},
url = {https://mlanthology.org/aaai/2021/huang2021aaai-acmo/}
}