Meta-Learning Optimizers for Communication-Efficient Learning

Abstract

Communication-efficient variants of SGD, most notably local SGD, have received considerable interest in recent years. These approaches compute multiple gradient steps locally on each worker before averaging model parameters, relieving the critical communication bottleneck in distributed deep learning training. Although many variants of these approaches have been proposed, they can sometimes lag behind state-of-the-art adaptive optimizers for deep learning. In this work, we investigate whether recent progress in the emerging area of learned optimizers can close this gap in homogeneous-data, homogeneous-device settings while remaining communication-efficient. Specifically, we meta-learn how to perform global updates given the update produced by local SGD iterations. Our results demonstrate that learned optimizers can substantially outperform local SGD and its sophisticated variants while maintaining their communication efficiency. Our learned optimizers can even generalize to unseen and much larger datasets and architectures, including ImageNet and ViTs, and to unseen modalities such as language modeling. We therefore show the potential of learned optimizers for improving communication-efficient distributed learning.
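To make the pipeline in the abstract concrete, below is a minimal PyTorch sketch of one communication round: each worker takes several local SGD steps from the current global weights, the workers' parameter deltas (pseudo-gradients) are averaged, and a small learned optimizer, rather than a fixed rule such as plain averaging or server-side momentum, produces the global update. Everything here is an illustrative assumption rather than the paper's actual design: the names (local_sgd_round, LearnedGlobalOptimizer, outer_step), the two-feature per-parameter MLP, and the hyperparameters are hypothetical, and meta-training of the learned optimizer itself is omitted.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def local_sgd_round(global_model, worker_loaders, lr=0.1, local_steps=8):
    """One communication round: every worker starts from the global weights,
    runs `local_steps` SGD steps on its own data, and reports its delta."""
    global_vec = nn.utils.parameters_to_vector(global_model.parameters()).detach()
    deltas = []
    for loader in worker_loaders:
        worker = copy.deepcopy(global_model)
        opt = torch.optim.SGD(worker.parameters(), lr=lr)
        batches = iter(loader)
        for _ in range(local_steps):
            x, y = next(batches)
            opt.zero_grad()
            F.cross_entropy(worker(x), y).backward()
            opt.step()
        worker_vec = nn.utils.parameters_to_vector(worker.parameters()).detach()
        deltas.append(global_vec - worker_vec)  # this worker's pseudo-gradient
    return torch.stack(deltas).mean(dim=0)      # averaged pseudo-gradient

class LearnedGlobalOptimizer(nn.Module):
    """Hypothetical per-parameter MLP mapping (pseudo-gradient, momentum)
    features to a global update; its weights would be meta-trained offline."""
    def __init__(self, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, delta, momentum):
        feats = torch.stack([delta, momentum], dim=-1)  # (num_params, 2)
        return self.mlp(feats).squeeze(-1)              # (num_params,)

@torch.no_grad()
def outer_step(global_model, lopt, delta, momentum, beta=0.9, outer_lr=1.0):
    """Apply the learned update to the global weights (one outer iteration)."""
    momentum.mul_(beta).add_(delta, alpha=1 - beta)
    update = lopt(delta, momentum)
    vec = nn.utils.parameters_to_vector(global_model.parameters())
    nn.utils.vector_to_parameters(vec - outer_lr * update, global_model.parameters())

For orientation, plain parameter averaging corresponds to update = delta with outer_lr = 1, and SlowMo-style server momentum corresponds to update = momentum; the learned optimizer generalizes both by letting a meta-trained network choose the global update.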

Cite

Text

Joseph et al. "Meta-Learning Optimizers for Communication-Efficient Learning." Transactions on Machine Learning Research, 2025.

Markdown

[Joseph et al. "Meta-Learning Optimizers for Communication-Efficient Learning." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/joseph2025tmlr-metalearning/)

BibTeX

@article{joseph2025tmlr-metalearning,
  title     = {{Meta-Learning Optimizers for Communication-Efficient Learning}},
  author    = {Joseph, Charles-Étienne and Thérien, Benjamin and Moudgil, Abhinav and Knyazev, Boris and Belilovsky, Eugene},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/joseph2025tmlr-metalearning/}
}