MARINA Meets Matrix Stepsizes: Variance Reduced Distributed Non-Convex Optimization

Abstract

Matrix-stepsized gradient descent algorithms have been shown to be more efficient for non-convex optimization than their scalar counterparts. The det-CGD algorithm, introduced by [LKR23], leverages matrix stepsizes to perform compressed gradient descent on matrix-smooth non-convex objectives in a federated manner. The authors establish convergence to a neighborhood of a weighted stationary point under a convexity condition on the symmetric positive-definite stepsize matrix. In this paper, we propose det-MARINA, a variance-reduced version of the det-CGD algorithm that incorporates the MARINA method. Notably, we establish, both theoretically and empirically, that det-MARINA outperforms both MARINA and the distributed det-CGD algorithms.
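To make the core idea concrete, below is a minimal single-node sketch of a matrix-stepsized compressed gradient step in the style of det-CGD: the compressed gradient is premultiplied by a symmetric positive-definite stepsize matrix `D` rather than scaled by a scalar. The Rand-k compressor, the toy quadratic objective, and the choice `D = 0.5 * A^{-1}` are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(g, k, rng):
    """Unbiased Rand-k sparsification: keep k random coordinates, rescale by d/k."""
    d = g.shape[0]
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(g)
    out[idx] = g[idx] * (d / k)
    return out

def matrix_step(x, grad_fn, D, k, rng):
    """One matrix-stepsized compressed gradient step: x+ = x - D @ C(grad f(x))."""
    g = rand_k(grad_fn(x), k, rng)
    return x - D @ g

# Toy quadratic f(x) = 0.5 x^T A x with ill-conditioned diagonal A.
A = np.diag([1.0, 10.0, 100.0])
grad_fn = lambda x: A @ x
D = 0.5 * np.linalg.inv(A)  # matrix stepsize adapted to per-coordinate smoothness

x = np.ones(3)
for _ in range(200):
    x = matrix_step(x, grad_fn, D, k=2, rng=rng)
```

A scalar stepsize on this problem would be limited by the largest curvature (100), while the matrix stepsize rescales each coordinate individually, which is the efficiency gain the abstract refers to.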

Cite

Text

Li et al. "MARINA Meets Matrix Stepsizes: Variance Reduced Distributed Non-Convex Optimization." NeurIPS 2023 Workshops: Federated_Learning, 2023.

Markdown

[Li et al. "MARINA Meets Matrix Stepsizes: Variance Reduced Distributed Non-Convex Optimization." NeurIPS 2023 Workshops: Federated_Learning, 2023.](https://mlanthology.org/neuripsw/2023/li2023neuripsw-marina/)

BibTeX

@inproceedings{li2023neuripsw-marina,
  title     = {{MARINA Meets Matrix Stepsizes: Variance Reduced Distributed Non-Convex Optimization}},
  author    = {Li, Hanmin and Karagulyan, Avetik and Richtárik, Peter},
  booktitle = {NeurIPS 2023 Workshops: Federated_Learning},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/li2023neuripsw-marina/}
}