An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback

Abstract

We consider the closely related problems of bandit convex optimization with two-point feedback, and zero-order stochastic convex optimization with two function evaluations per round. We provide a simple algorithm and analysis which is optimal for convex Lipschitz functions. This improves on Duchi et al. (2015), which only provides an optimal result for smooth functions; moreover, the algorithm and analysis are simpler, and readily extend to non-Euclidean problems. The algorithm is based on a small but surprisingly powerful modification of the gradient estimator.
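As context for the abstract, the classical two-point gradient estimator that such algorithms build on can be sketched as follows. This is only the standard form (query the function at two symmetric perturbations of the current point and scale the difference along the random direction); the paper's specific modification of this estimator is not reproduced here, and the function and parameter names are illustrative.

```python
import numpy as np

def two_point_gradient_estimate(f, x, delta=1e-3, rng=None):
    """Classical two-point gradient estimator (illustrative sketch).

    g = (d / (2*delta)) * (f(x + delta*u) - f(x - delta*u)) * u,
    with u drawn uniformly from the unit sphere. Its expectation over u
    approximates the gradient of a smoothed version of f.
    """
    rng = rng or np.random.default_rng()
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # uniform random direction on the unit sphere
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

# Sanity check on a quadratic f(x) = ||x||^2, whose true gradient is 2x:
# averaging many independent estimates should recover 2x.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = lambda z: float(z @ z)
    x = np.array([1.0, 2.0, 3.0])
    estimates = np.mean(
        [two_point_gradient_estimate(f, x, rng=rng) for _ in range(20000)],
        axis=0,
    )
    print(estimates)  # approximately [2.0, 4.0, 6.0]
```

For a quadratic, each estimate equals 2d (x·u) u, and since E[u uᵀ] = I/d over the sphere, the average converges to the true gradient 2x; in the bandit/zero-order setting the two evaluations of f are the only feedback the learner receives each round.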

Cite

Text

Shamir. "An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback." Journal of Machine Learning Research, 2017.

Markdown

[Shamir. "An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback." Journal of Machine Learning Research, 2017.](https://mlanthology.org/jmlr/2017/shamir2017jmlr-optimal/)

BibTeX

@article{shamir2017jmlr-optimal,
  title     = {{An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback}},
  author    = {Shamir, Ohad},
  journal   = {Journal of Machine Learning Research},
  year      = {2017},
  pages     = {1--11},
  volume    = {18},
  url       = {https://mlanthology.org/jmlr/2017/shamir2017jmlr-optimal/}
}