Towards Optimal Communication Complexity in Distributed Non-Convex Optimization

Abstract

We study the problem of distributed stochastic non-convex optimization with intermittent communication. We consider both the full participation setting, where $M$ machines work in parallel over $R$ communication rounds, and the partial participation setting, where $M$ machines are sampled independently every round from some meta-distribution over machines. We propose and analyze a new algorithm that improves upon existing methods by requiring fewer and lighter variance reduction operations. We also present lower bounds showing that our algorithm is either *optimal* or *almost optimal* in most settings. Numerical experiments demonstrate the superior performance of our algorithm.
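
The abstract does not spell out the paper's algorithm, but the intermittent-communication setting it refers to can be illustrated with a minimal local-SGD sketch: machines run several local gradient steps between communication rounds, and either all $M$ machines (full participation) or a freshly sampled subset (partial participation) contribute to each round's average. Everything below is a hypothetical toy illustration of that setting, not the paper's method; the quadratic objective, parameter values, and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: machine m holds f_m(x) = 0.5 * ||x - b_m||^2,
# so the global objective F(x) = (1/M) * sum_m f_m(x) is minimized at the
# mean of the b_m. The b_m differ across machines to model heterogeneity.
M, R, K, d = 8, 50, 10, 5      # machines, rounds, local steps per round, dimension
lr, noise = 0.1, 0.01          # step size and stochastic-gradient noise level
b = rng.normal(size=(M, d))    # per-machine optima

def stochastic_grad(m, x):
    """Noisy gradient of f_m at x; the noise stands in for minibatch sampling."""
    return (x - b[m]) + noise * rng.normal(size=d)

def run(partial=False, sampled=4):
    """Local SGD with intermittent communication.

    Full participation: all M machines take K local steps each round,
    then their iterates are averaged (one communication round).
    Partial participation: a fresh subset of machines is sampled
    independently every round.
    """
    x = np.zeros(d)
    for _ in range(R):
        active = rng.choice(M, size=sampled, replace=False) if partial else range(M)
        local_iterates = []
        for m in active:
            y = x.copy()
            for _ in range(K):          # local updates, no communication
                y -= lr * stochastic_grad(m, y)
            local_iterates.append(y)
        x = np.mean(local_iterates, axis=0)  # communication round: average
    return x

x_star = b.mean(axis=0)
for mode in (False, True):
    x = run(partial=mode)
    print(f"partial={mode}: distance to optimum = {np.linalg.norm(x - x_star):.4f}")
```

In this sketch the only communication cost is the averaging step, so the number of rounds $R$ is the communication budget the paper's bounds are stated in; the variance reduction operations the abstract mentions are not modeled here.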

Cite

Text

Patel et al. "Towards Optimal Communication Complexity in Distributed Non-Convex Optimization." Neural Information Processing Systems, 2022.

Markdown

[Patel et al. "Towards Optimal Communication Complexity in Distributed Non-Convex Optimization." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/patel2022neurips-optimal/)

BibTeX

@inproceedings{patel2022neurips-optimal,
  title     = {{Towards Optimal Communication Complexity in Distributed Non-Convex Optimization}},
  author    = {Patel, Kumar Kshitij and Wang, Lingxiao and Woodworth, Blake E. and Bullins, Brian and Srebro, Nati},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/patel2022neurips-optimal/}
}