Large Batch Optimization for Deep Learning Using New Complete Layer-Wise Adaptive Rate Scaling

AAAI 2021 pp. 7883-7890

doi:10.1609/AAAI.V35I9.16962

Abstract

Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications. Warmup is one of the nontrivial techniques used to stabilize the convergence of large-batch training. However, warmup is an empirical method, and it is still unknown whether there is a better algorithm with theoretical underpinnings. In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training. We prove the convergence of our algorithm by introducing a new fine-grained analysis of gradient-based methods. Furthermore, the new analysis also helps to explain two other empirical tricks: layer-wise adaptive rate scaling and linear learning rate scaling. We conduct extensive experiments and demonstrate that the proposed algorithm outperforms the gradual warmup technique by a large margin and converges better than the state-of-the-art large-batch optimizer when training advanced deep neural networks (ResNet, DenseNet, MobileNet) on the ImageNet dataset.
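For readers unfamiliar with the three existing heuristics the abstract names, the following is a minimal sketch of how they are commonly formulated. It is not the CLARS update, which this record does not specify. The function names, the default `trust_coef`, and the example numbers are illustrative assumptions; the layer-wise formula follows the usual LARS form, a per-layer rate proportional to the ratio of the weight norm to the gradient norm.

```python
import math


def scaled_lr(base_lr, base_batch, batch):
    """Linear learning rate scaling: the rate grows in proportion to batch size."""
    return base_lr * batch / base_batch


def warmup_lr(target_lr, step, warmup_steps):
    """Gradual warmup: ramp linearly up to target_lr over warmup_steps steps."""
    if step >= warmup_steps:
        return target_lr
    return target_lr * (step + 1) / warmup_steps


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lars_layer_lr(global_lr, weights, grads, trust_coef=0.001,
                  weight_decay=0.0, eps=1e-12):
    """LARS-style layer-wise rate (assumed form, not CLARS):
    global_lr * trust_coef * ||w|| / (||g|| + weight_decay * ||w||)."""
    w_norm = l2_norm(weights)
    g_norm = l2_norm(grads)
    if w_norm == 0.0 or g_norm == 0.0:
        # Fall back to the global rate when the ratio is undefined.
        return global_lr
    return global_lr * trust_coef * w_norm / (g_norm + weight_decay * w_norm + eps)


if __name__ == "__main__":
    # Hypothetical setup: base rate 0.1 at batch 256, scaled to batch 8192.
    peak = scaled_lr(0.1, 256, 8192)
    for step in range(6):
        print(step, warmup_lr(peak, step, warmup_steps=5))
    # One toy "layer" with two weights and their gradients.
    print(lars_layer_lr(1.0, [3.0, 4.0], [0.6, 0.8]))
```

In this sketch, warmup and linear scaling set a single global rate, while the layer-wise ratio rescales that rate per layer; the abstract's claim is that CLARS removes the need for the warmup stage.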

Cite

Text

Huo et al. "Large Batch Optimization for Deep Learning Using New Complete Layer-Wise Adaptive Rate Scaling." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I9.16962

Markdown

[Huo et al. "Large Batch Optimization for Deep Learning Using New Complete Layer-Wise Adaptive Rate Scaling." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/huo2021aaai-large/) doi:10.1609/AAAI.V35I9.16962

BibTeX

@inproceedings{huo2021aaai-large,
  title     = {{Large Batch Optimization for Deep Learning Using New Complete Layer-Wise Adaptive Rate Scaling}},
  author    = {Huo, Zhouyuan and Gu, Bin and Huang, Heng},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {7883-7890},
  doi       = {10.1609/AAAI.V35I9.16962},
  url       = {https://mlanthology.org/aaai/2021/huo2021aaai-large/}
}