Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods

Abstract

We consider derivative-free algorithms for stochastic optimization problems that use only noisy function values rather than gradients, analyzing their finite-sample convergence rates. We show that if pairs of function values are available, algorithms that use gradient estimates based on random perturbations suffer a factor of at most $\sqrt{d}$ in convergence rate over traditional stochastic gradient methods, where $d$ is the problem dimension. We complement our algorithmic development with information-theoretic lower bounds on the minimax convergence rate of such problems, which show that our bounds are sharp with respect to all problem-dependent quantities: they cannot be improved by more than constant factors.
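
The two-point scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian perturbation direction, the $1/\sqrt{t}$ step size, the iterate averaging, and all names (`two_point_gradient_estimate`, `zero_order_sgd`, `F`, `sample_xi`) are assumptions chosen for illustration. The key ingredient is that both function evaluations use the same random sample, so the stochastic noise largely cancels in the difference quotient.

```python
import numpy as np

def two_point_gradient_estimate(F, x, xi, u=1e-4, rng=None):
    """Zero-order gradient estimate from a pair of noisy function values.

    F(x, xi) returns a noisy function value at x for the random sample xi.
    Both evaluations share the same xi, so the noise cancels in the
    difference; u is a small smoothing parameter. (Illustrative sketch,
    not the estimator from the paper.)
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(x.shape)            # random perturbation direction
    return (F(x + u * z, xi) - F(x, xi)) / u * z

def zero_order_sgd(F, sample_xi, x0, steps=1000, step_size=0.1, u=1e-4, seed=0):
    """Stochastic gradient descent driven by two-point gradient estimates."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(1, steps + 1):
        xi = sample_xi(rng)
        g = two_point_gradient_estimate(F, x, xi, u=u, rng=rng)
        x = x - step_size / np.sqrt(t) * g      # decaying step size
        avg += (x - avg) / t                    # running average of iterates
    return avg

# Toy usage: minimize E[ 0.5 * ||x - xi||^2 ] with xi ~ N(ones, I).
if __name__ == "__main__":
    dim = 10
    target = np.ones(dim)
    F = lambda x, xi: 0.5 * np.sum((x - xi) ** 2)
    sample_xi = lambda rng: target + rng.standard_normal(dim)
    x_hat = zero_order_sgd(F, sample_xi, x0=np.zeros(dim), steps=5000)
    print(np.round(x_hat, 2))                   # close to the all-ones vector
```

Under the paper's analysis, estimators built from such pairs of function values pay only an extra $\sqrt{d}$ factor relative to stochastic gradient methods; the variance of the random-direction estimate grows with the dimension, which is where that factor enters.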

Cite

Text

Wibisono et al. "Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods." Neural Information Processing Systems, 2012.

Markdown

[Wibisono et al. "Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods." Neural Information Processing Systems, 2012.](https://mlanthology.org/neurips/2012/wibisono2012neurips-finite/)

BibTeX

@inproceedings{wibisono2012neurips-finite,
  title     = {{Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods}},
  author    = {Wibisono, Andre and Wainwright, Martin J. and Jordan, Michael I. and Duchi, John C.},
  booktitle = {Neural Information Processing Systems},
  year      = {2012},
  pages     = {1439--1447},
  url       = {https://mlanthology.org/neurips/2012/wibisono2012neurips-finite/}
}