Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks
Abstract
We propose Multi-Level Local SGD, a distributed stochastic gradient method for learning a smooth, non-convex objective in a multi-level communication network with heterogeneous workers. Our network model consists of a set of disjoint sub-networks, each with a single hub and multiple workers; further, workers may operate at different rates. The hubs exchange information with one another via a connected, but not necessarily complete, communication network. In our algorithm, each sub-network executes a distributed SGD algorithm using a hub-and-spoke paradigm, and the hubs periodically average their models with neighboring hubs. We first provide a unified mathematical framework that describes the Multi-Level Local SGD algorithm. We then present a theoretical analysis of the algorithm; our analysis shows the dependence of the convergence error on the worker node heterogeneity, the hub network topology, and the number of local, sub-network, and global iterations. We illustrate the effectiveness of our algorithm in a multi-level network with slow workers via simulation-based experiments.
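To make the three-level structure concrete, below is a minimal NumPy sketch of the averaging pattern the abstract describes: workers take local SGD steps, each hub averages the models of its own workers, and hubs periodically average with neighboring hubs over the hub graph. The function name multilevel_local_sgd, the arguments (grad_fn, hub_workers, hub_neighbors, the step counts), and the uniform gossip weights are all illustrative assumptions, and the synchronous loop below ignores the worker rate heterogeneity the paper actually handles; this is a sketch of the pattern, not the authors' implementation.

import numpy as np

def multilevel_local_sgd(grad_fn, dim, hub_workers, hub_neighbors,
                         lr=0.01, local_steps=5, subnet_rounds=4,
                         global_rounds=10, seed=0):
    """Illustrative sketch (not the paper's algorithm verbatim).

    hub_workers:   {hub_id: number_of_workers}
    hub_neighbors: {hub_id: [neighbor hub_ids]}, a connected graph
    grad_fn(x, rng): stochastic gradient of the objective at x
    """
    rng = np.random.default_rng(seed)
    # One model copy per worker, grouped by hub.
    models = {h: [np.zeros(dim) for _ in range(n)]
              for h, n in hub_workers.items()}
    for _ in range(global_rounds):
        for _ in range(subnet_rounds):
            # Level 1: each worker runs local SGD steps independently.
            for ws in models.values():
                for i, x in enumerate(ws):
                    for _ in range(local_steps):
                        x = x - lr * grad_fn(x, rng)
                    ws[i] = x
            # Level 2: each hub averages its own workers (hub-and-spoke).
            for h, ws in models.items():
                avg = sum(ws) / len(ws)
                models[h] = [avg.copy() for _ in ws]
        # Level 3: hubs average with their neighbors (uniform weights here;
        # the paper allows a general mixing matrix over the hub graph).
        hub_avgs = {h: ws[0].copy() for h, ws in models.items()}
        for h, ws in models.items():
            group = [hub_avgs[h]] + [hub_avgs[g] for g in hub_neighbors[h]]
            mixed = sum(group) / len(group)
            models[h] = [mixed.copy() for _ in ws]
    return models

# Toy usage: minimize ||x - 1||^2 with noisy gradients on two hubs.
grad = lambda x, rng: 2 * (x - 1.0) + 0.1 * rng.standard_normal(x.shape)
out = multilevel_local_sgd(grad, dim=3,
                           hub_workers={0: 2, 1: 3},
                           hub_neighbors={0: [1], 1: [0]})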
Cite
Text
Castiglia et al. "Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks." International Conference on Learning Representations, 2021.
Markdown
[Castiglia et al. "Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/castiglia2021iclr-multilevel/)
BibTeX
@inproceedings{castiglia2021iclr-multilevel,
  title     = {{Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks}},
  author    = {Castiglia, Timothy and Das, Anirban and Patterson, Stacy},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/castiglia2021iclr-multilevel/}
}