Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent

Abstract

Decentralized Stochastic Gradient Descent (D-SGD) is a communication-efficient approach for learning from large, distributed datasets. Inspired by parallel optimization, incorporating minibatches reduces gradient variance and thereby accelerates optimization. Nevertheless, to the best of our knowledge, the existing literature has not thoroughly explored the learning-theory foundations of Decentralized Minibatch Stochastic Gradient Descent (DM-SGD). In this paper, we address this theoretical gap by investigating the generalization properties of DM-SGD. We establish sharper generalization bounds for DM-SGD under both with-replacement and without-replacement sampling, covering (non)convex and (non)smooth cases. Moreover, our results consistently recover those of Centralized Stochastic Gradient Descent (C-SGD). In addition, we derive a generalization analysis for the Zero-Order (ZO) version of DM-SGD.
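For readers unfamiliar with the algorithm discussed in the abstract, the following is a minimal sketch of one DM-SGD round: each node takes a local minibatch gradient step and averages its model with its neighbors through a gossip (mixing) matrix. It assumes a doubly stochastic mixing matrix W and with-replacement minibatch sampling; the function names and arguments (e.g., grad_fn) are illustrative and not taken from the paper's code.

import numpy as np

def dm_sgd_round(X, W, data, grad_fn, lr=0.1, batch_size=8, rng=None):
    """One round of decentralized minibatch SGD (illustrative sketch).

    X        : (m, d) array; row i is node i's current model parameters.
    W        : (m, m) doubly stochastic mixing matrix (gossip weights).
    data     : list of m local datasets; data[i] is an array of node i's samples.
    grad_fn  : grad_fn(x, batch) -> minibatch gradient of the local loss at x.
    """
    rng = rng or np.random.default_rng()
    m, _ = X.shape
    grads = np.zeros_like(X)
    for i in range(m):
        # Sample a local minibatch with replacement (the paper also
        # analyzes the without-replacement setting).
        idx = rng.integers(len(data[i]), size=batch_size)
        grads[i] = grad_fn(X[i], data[i][idx])
    # Gossip averaging with neighbors, followed by a local gradient step.
    return W @ X - lr * grads

With m = 1 and W = [[1]], the update reduces to a single centralized minibatch SGD step, mirroring how the paper's bounds recover the C-SGD results.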

Cite

Text

Wang and Chen. "Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I14.29477

Markdown

[Wang and Chen. "Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/wang2024aaai-stability/) doi:10.1609/AAAI.V38I14.29477

BibTeX

@inproceedings{wang2024aaai-stability,
  title     = {{Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent}},
  author    = {Wang, Jiahuan and Chen, Hong},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {15511-15519},
  doi       = {10.1609/AAAI.V38I14.29477},
  url       = {https://mlanthology.org/aaai/2024/wang2024aaai-stability/}
}