Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence

Abstract

Distributed optimization is the standard way of speeding up machine learning training, and most of the research in the area focuses on distributed first-order, gradient-based methods. Yet, there are settings where some computationally bounded nodes may not be able to implement first-order, gradient-based optimization, while they could still contribute to joint optimization tasks. In this paper, we initiate the study of hybrid decentralized optimization, studying settings where nodes with zeroth-order and first-order optimization capabilities co-exist in a distributed system and attempt to jointly solve an optimization task over some data distribution. We show that, under reasonable parameter settings, such a system can not only withstand noisier zeroth-order agents but can even benefit from integrating such agents into the optimization process, rather than ignoring their information. At the core of our approach is a new analysis of distributed optimization with noisy and possibly biased gradient estimators, which may be of independent interest. Our results hold for both convex and non-convex objectives. Experimental results on standard optimization tasks confirm our analysis, showing that hybrid first-/zeroth-order optimization can be practical, even when training deep neural networks.
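
The following is a minimal, illustrative sketch (not the paper's actual algorithm) of the kind of setup the abstract describes: first-order agents supply true gradients, zeroth-order agents supply noisy two-point random-direction estimates, and a simple mean-aggregation step combines them. All names (hybrid_step, first_order_grad, zeroth_order_grad) and the aggregation scheme are assumptions made for illustration.

import numpy as np

def first_order_grad(grad_fn, x):
    # First-order agent: returns the exact gradient at x.
    return grad_fn(x)

def zeroth_order_grad(loss_fn, x, mu=1e-4, rng=None):
    # Zeroth-order agent: two-point random-direction gradient estimator,
    # unbiased for the smoothed objective but noisier than a true gradient.
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.shape)
    return (loss_fn(x + mu * u) - loss_fn(x - mu * u)) / (2 * mu) * u

def hybrid_step(x, loss_fn, grad_fn, n_first, n_zeroth, lr=0.1, rng=None):
    # One synchronous round (hypothetical scheme): every agent contributes
    # an estimate, the estimates are averaged, and a gradient step is taken.
    rng = np.random.default_rng() if rng is None else rng
    estimates = [first_order_grad(grad_fn, x) for _ in range(n_first)]
    estimates += [zeroth_order_grad(loss_fn, x, rng=rng) for _ in range(n_zeroth)]
    return x - lr * np.mean(estimates, axis=0)

# Toy usage on f(x) = ||x||^2 / 2: the iterate norm should shrink toward 0
# even though a majority of agents only provide zeroth-order estimates.
x = np.ones(10)
for _ in range(200):
    x = hybrid_step(x, lambda v: 0.5 * v @ v, lambda v: v, n_first=3, n_zeroth=5)
print(np.linalg.norm(x))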

Cite

Text

Talaei et al. "Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I19.34290

Markdown

[Talaei et al. "Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/talaei2025aaai-hybrid/) doi:10.1609/AAAI.V39I19.34290

BibTeX

@inproceedings{talaei2025aaai-hybrid,
  title     = {{Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence}},
  author    = {Talaei, Shayan and Ansaripour, Matin and Nadiradze, Giorgi and Alistarh, Dan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {20778--20786},
  doi       = {10.1609/AAAI.V39I19.34290},
  url       = {https://mlanthology.org/aaai/2025/talaei2025aaai-hybrid/}
}