General Stability Analysis for Zeroth-Order Optimization Algorithms

Abstract

Zeroth-order optimization algorithms are widely used for black-box optimization problems, such as those arising in machine learning and prompt engineering, where gradients are approximated from function evaluations. Recently, a generalization result was established for zeroth-order stochastic gradient descent (SGD) algorithms through stability analysis. However, that result was limited to the vanilla 2-point zeroth-order estimate with Gaussian sampling used in SGD algorithms. To address these limitations, we propose a general proof framework for stability analysis that applies to convex, strongly convex, and non-convex settings, and yields results for popular zeroth-order optimization algorithms, including SGD, GD, and SVRG, as well as various zeroth-order estimates, such as 1-point and 2-point estimates with different sampling distributions, and coordinate estimates. Our general analysis shows that coordinate estimation can lead to tighter generalization bounds for the SGD, GD, and SVRG variants of zeroth-order optimization algorithms, because coordinate estimates introduce a smaller expansion factor into the stability analysis.
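
To make the estimators mentioned in the abstract concrete, here is a minimal NumPy sketch (our own illustration, not code from the paper) contrasting a 2-point Gaussian-smoothed gradient estimate with a coordinate-wise finite-difference estimate; the smoothing radius mu, the toy loss f, and the step size eta are illustrative placeholders.

import numpy as np

def two_point_gaussian_estimate(f, x, mu=1e-3):
    # Vanilla 2-point zeroth-order estimate with Gaussian sampling:
    # g = ((f(x + mu*u) - f(x)) / mu) * u, where u ~ N(0, I).
    u = np.random.randn(*x.shape)
    return (f(x + mu * u) - f(x)) / mu * u

def coordinate_estimate(f, x, mu=1e-3):
    # Coordinate-wise zeroth-order estimate: a central finite difference
    # along each standard basis vector e_i.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = 1.0
        g[i] = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
    return g

# One zeroth-order SGD step on a toy quadratic loss (illustrative only).
f = lambda w: 0.5 * np.sum(w ** 2)
w = np.ones(5)
eta = 0.1
w_next = w - eta * coordinate_estimate(f, w)  # or two_point_gaussian_estimate(f, w)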

Cite

Text

Liu et al. "General Stability Analysis for Zeroth-Order Optimization Algorithms." International Conference on Learning Representations, 2024.

Markdown

[Liu et al. "General Stability Analysis for Zeroth-Order Optimization Algorithms." International Conference on Learning Representations, 2024.](https://mlanthology.org/iclr/2024/liu2024iclr-general/)

BibTeX

@inproceedings{liu2024iclr-general,
  title     = {{General Stability Analysis for Zeroth-Order Optimization Algorithms}},
  author    = {Liu, Xinyue and Zhang, Hualin and Gu, Bin and Chen, Hong},
  booktitle = {International Conference on Learning Representations},
  year      = {2024},
  url       = {https://mlanthology.org/iclr/2024/liu2024iclr-general/}
}