General Stability Analysis for Zeroth-Order Optimization Algorithms
Abstract
Zeroth-order optimization algorithms are widely used for black-box optimization problems, such as those in machine learning and prompt engineering, where gradients are approximated from function evaluations. Recently, a generalization result was provided for zeroth-order stochastic gradient descent (SGD) algorithms through stability analysis. However, this result was limited to the vanilla 2-point zeroth-order estimate with Gaussian sampling used in SGD algorithms. To address these limitations, we propose a general proof framework for stability analysis that applies to convex, strongly convex, and non-convex settings, and yields results for popular zeroth-order optimization algorithms, including SGD, GD, and SVRG, as well as various zeroth-order estimates, such as 1-point and 2-point estimates with different sampling distributions and coordinate estimates. Our general analysis shows that coordinate estimation can lead to tighter generalization bounds for the SGD, GD, and SVRG versions of zeroth-order optimization algorithms, because coordinate estimates introduce a smaller expansion factor in the stability analysis.
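For readers unfamiliar with the estimator families the abstract contrasts, the following is a minimal sketch of a standard 2-point Gaussian estimate and a coordinate-wise estimate. It is a generic illustration of common zeroth-order gradient estimators, not code or notation taken from the paper; the function names, the smoothing parameter mu, and the quadratic test function are all illustrative choices.

```python
import numpy as np

def two_point_gaussian_estimate(f, x, mu=1e-4):
    """2-point zeroth-order estimate along a random Gaussian direction."""
    u = np.random.randn(*x.shape)                      # random search direction
    return (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u

def coordinate_estimate(f, x, mu=1e-4):
    """Coordinate-wise zeroth-order estimate: one finite difference per coordinate."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = 1.0
        g[i] = (f(x + mu * e) - f(x - mu * e)) / (2 * mu)
    return g

# Both estimators approximate the gradient of a simple quadratic f(x) = 0.5 * ||x||^2,
# whose true gradient is x itself.
f = lambda x: 0.5 * np.dot(x, x)
x = np.array([1.0, -2.0, 3.0])
print(coordinate_estimate(f, x))          # deterministic, close to the true gradient
print(two_point_gaussian_estimate(f, x))  # random direction, higher variance per call
```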
Cite
Text
Liu et al. "General Stability Analysis for Zeroth-Order Optimization Algorithms." International Conference on Learning Representations, 2024.
BibTeX
@inproceedings{liu2024iclr-general,
title = {{General Stability Analysis for Zeroth-Order Optimization Algorithms}},
author = {Liu, Xinyue and Zhang, Hualin and Gu, Bin and Chen, Hong},
booktitle = {International Conference on Learning Representations},
year = {2024},
url = {https://mlanthology.org/iclr/2024/liu2024iclr-general/}
}