Improving Sharpness-Aware Minimization by Lookahead

Abstract

Sharpness-Aware Minimization (SAM), which performs gradient descent on adversarially perturbed weights, can improve generalization by identifying flatter minima. However, recent studies have shown that SAM may suffer from convergence instability and oscillate around saddle points, resulting in slow convergence and inferior performance. To address this problem, we propose the use of a lookahead mechanism that gathers more information about the landscape by looking further ahead, and thus finds a better trajectory towards convergence. By examining the nature of SAM, we simplify the extrapolation procedure, resulting in a more efficient algorithm. Theoretical results show that the proposed method converges to a stationary point and is less prone to saddle points. Experiments on standard benchmark datasets also verify that the proposed method outperforms state-of-the-art methods and converges more effectively to flat minima.
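For context, the sketch below shows a minimal vanilla SAM update in PyTorch: the weights are perturbed along the normalized gradient by a radius rho, and the descent step then uses the gradient evaluated at the perturbed point. The helper name `sam_step`, the default `rho=0.05`, and the training interfaces are illustrative assumptions; the paper's lookahead extrapolation on top of this base update is not reproduced here.

```python
import torch

def sam_step(model, loss_fn, inputs, targets, base_opt, rho=0.05):
    """One vanilla SAM update (illustrative sketch, not the paper's lookahead variant)."""
    # First pass: gradient at the current weights w.
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    params = [p for p in model.parameters() if p.grad is not None]
    grads = [p.grad.detach().clone() for p in params]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12

    # Ascent step: move to the adversarially perturbed weights w + rho * g / ||g||.
    eps = []
    with torch.no_grad():
        for p, g in zip(params, grads):
            e = rho * g / grad_norm
            p.add_(e)
            eps.append(e)
    model.zero_grad()

    # Second pass: gradient at the perturbed weights, then restore w and descend.
    loss_fn(model(inputs), targets).backward()
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
    base_opt.step()
    base_opt.zero_grad()
    return loss.item()
```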

Cite

Text

Yu et al. "Improving Sharpness-Aware Minimization by Lookahead." International Conference on Machine Learning, 2024.

Markdown

[Yu et al. "Improving Sharpness-Aware Minimization by Lookahead." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/yu2024icml-improving/)

BibTeX

@inproceedings{yu2024icml-improving,
  title     = {{Improving Sharpness-Aware Minimization by Lookahead}},
  author    = {Yu, Runsheng and Zhang, Youzhi and Kwok, James},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {57776--57802},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/yu2024icml-improving/}
}