Towards Optimal Communication Complexity in Distributed Non-Convex Optimization
Abstract
We study the problem of distributed stochastic non-convex optimization with intermittent communication. We consider the full participation setting, where $M$ machines work in parallel over $R$ communication rounds, and the partial participation setting, where $M$ machines are sampled independently in every round from some meta-distribution over machines. We propose and analyze a new algorithm that improves upon existing methods by requiring fewer and lighter variance reduction operations. We also present lower bounds, showing that our algorithm is either $\textit{optimal}$ or $\textit{almost optimal}$ in most settings. Numerical experiments demonstrate the superior performance of our algorithm.
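For readers unfamiliar with the intermittent-communication setting the abstract describes, here is a minimal sketch of the generic Local SGD baseline: $M$ machines each take $K$ local stochastic-gradient steps between each of $R$ communication rounds, then average their iterates. This illustrates only the problem setting, not the paper's proposed variance-reduced algorithm; the toy objective, `grad_oracle`, and all hyperparameters below are hypothetical choices, not from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_oracle(x, machine):
    """Stochastic gradient of a toy non-convex objective f(x) = sum(cos(x)),
    plus a machine-specific quadratic term and additive Gaussian noise."""
    heterogeneity = 0.1 * (x - machine)      # simple inter-machine drift
    return -np.sin(x) + heterogeneity + rng.normal(scale=0.1, size=x.shape)

def local_sgd(M=8, R=20, K=10, lr=0.05, dim=5):
    x = np.zeros(dim)                        # shared initial iterate
    for _ in range(R):                       # R communication rounds
        local_iterates = []
        for m in range(M):                   # M machines (full participation)
            x_m = x.copy()
            for _ in range(K):               # K local steps, no communication
                x_m -= lr * grad_oracle(x_m, m)
            local_iterates.append(x_m)
        x = np.mean(local_iterates, axis=0)  # one round of communication
    return x

print(local_sgd())
```

In the partial participation setting, the inner loop would instead run over $M$ machines freshly sampled each round from the meta-distribution, rather than over a fixed set.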
Cite
Text
Patel et al. "Towards Optimal Communication Complexity in Distributed Non-Convex Optimization." Neural Information Processing Systems, 2022.
Markdown
[Patel et al. "Towards Optimal Communication Complexity in Distributed Non-Convex Optimization." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/patel2022neurips-optimal/)
BibTeX
@inproceedings{patel2022neurips-optimal,
title = {{Towards Optimal Communication Complexity in Distributed Non-Convex Optimization}},
author = {Patel, Kumar Kshitij and Wang, Lingxiao and Woodworth, Blake E and Bullins, Brian and Srebro, Nati},
booktitle = {Neural Information Processing Systems},
year = {2022},
url = {https://mlanthology.org/neurips/2022/patel2022neurips-optimal/}
}