Asynchronous Decentralized SGD with Quantized and Local Updates

Abstract

Decentralized optimization is emerging as a viable alternative for scalable distributed machine learning, but it also introduces new challenges in terms of synchronization costs. To address these costs, several communication-reduction techniques, such as non-blocking communication, quantization, and local steps, have been explored in the decentralized setting. Due to the complexity of analyzing optimization in such a relaxed setting, this line of work often assumes \emph{global} communication rounds, which require additional synchronization. In this paper, we consider decentralized optimization in the simpler, but harder to analyze, \emph{asynchronous gossip} model, in which communication occurs in discrete, randomly chosen pairings among nodes. Perhaps surprisingly, we show that a variant of SGD called \emph{SwarmSGD} still converges in this setting, even if \emph{non-blocking communication}, \emph{quantization}, and \emph{local steps} are all applied \emph{in conjunction}, and even if the node data distributions and underlying graph topology are both \emph{heterogeneous}. Our analysis is based on a new connection with multi-dimensional load-balancing processes. We implement this algorithm and deploy it in a supercomputing environment, showing that it can outperform previous decentralized methods in terms of end-to-end training time, and that it can even rival carefully-tuned large-batch SGD for certain tasks.
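
To make the gossip setting described in the abstract more concrete, here is a minimal toy sketch in Python. It is not the authors' SwarmSGD implementation: the function names (`quantize`, `local_sgd`, `gossip_sgd`), the scalar quadratic objectives, the complete interaction graph, and the stochastic-rounding quantizer are all assumptions chosen for illustration, and non-blocking communication is not modeled. It only shows the general pattern of randomly paired nodes that take a few local SGD steps on their own data and then average quantized copies of their models.

```python
import math
import random


def quantize(value, step, rng):
    """Stochastic rounding to a grid of width `step` (an illustrative
    stand-in quantizer, not the scheme used in the paper)."""
    scaled = value / step
    low = math.floor(scaled)
    # Round up with probability equal to the fractional part (unbiased).
    round_up = rng.random() < (scaled - low)
    return (low + (1 if round_up else 0)) * step


def local_sgd(x, target, steps, lr, rng):
    """Run `steps` noisy gradient steps on f(x) = 0.5 * (x - target)**2."""
    for _ in range(steps):
        grad = (x - target) + rng.gauss(0.0, 0.1)  # stochastic gradient
        x -= lr * grad
    return x


def gossip_sgd(targets, interactions=5000, local_steps=2, lr=0.01,
               step=0.01, seed=0):
    """Toy pairwise-gossip SGD on a complete graph (hypothetical setup).

    Node i holds the objective 0.5 * (x - targets[i])**2, so the node
    data are heterogeneous and the global optimum is the mean of targets.
    """
    rng = random.Random(seed)
    n = len(targets)
    models = [0.0] * n
    for _ in range(interactions):
        # One gossip interaction: a uniformly random pair of distinct nodes.
        i, j = rng.sample(range(n), 2)
        # Each node first takes local steps on its own objective.
        models[i] = local_sgd(models[i], targets[i], local_steps, lr, rng)
        models[j] = local_sgd(models[j], targets[j], local_steps, lr, rng)
        # The pair then averages quantized copies of their models.
        avg = 0.5 * (quantize(models[i], step, rng)
                     + quantize(models[j], step, rng))
        models[i] = models[j] = avg
    return models


if __name__ == "__main__":
    final_models = gossip_sgd([1.0, 2.0, 3.0, 4.0])
    print(final_models)
```

In this toy setup the node models should end up clustered near the average of the targets, which is the minimizer of the summed objectives. The step size, number of local steps, and quantization grid are arbitrary illustrative values rather than settings taken from the paper.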

Cite

Text

Nadiradze et al. "Asynchronous Decentralized SGD with Quantized and Local Updates." Neural Information Processing Systems, 2021.

Markdown

[Nadiradze et al. "Asynchronous Decentralized SGD with Quantized and Local Updates." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/nadiradze2021neurips-asynchronous/)

BibTeX

@inproceedings{nadiradze2021neurips-asynchronous,
  title     = {{Asynchronous Decentralized SGD with Quantized and Local Updates}},
  author    = {Nadiradze, Giorgi and Sabour, Amirmojtaba and Davies, Peter and Li, Shigang and Alistarh, Dan},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/nadiradze2021neurips-asynchronous/}
}