An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback
Abstract
We consider the closely related problems of bandit convex optimization with two-point feedback, and zero-order stochastic convex optimization with two function evaluations per round. We provide a simple algorithm and analysis which is optimal for convex Lipschitz functions. This improves on Duchi et al. (2015), which only provides an optimal result for smooth functions. Moreover, the algorithm and analysis are simpler, and readily extend to non-Euclidean problems. The algorithm is based on a small but surprisingly powerful modification of the gradient estimator.
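For intuition, the standard two-point gradient estimator underlying this line of work queries the function at two symmetric perturbations of the current point. The sketch below shows the generic form of such an estimator; it is an illustration under standard assumptions, not the paper's specific modified variant (the function name and scaling here are generic choices, not taken from the paper).

```python
import numpy as np

def two_point_gradient_estimate(f, x, delta, rng):
    """Generic two-point (zero-order) gradient estimator.

    Queries f at x + delta*u and x - delta*u for a random unit
    direction u, and returns an unbiased-in-the-limit estimate of
    the gradient of a smoothed version of f.
    """
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)  # uniform random direction on the unit sphere
    d = x.size
    # Finite-difference along u, scaled by d so that E[estimate]
    # approximates the true gradient (since E[u u^T] = I/d).
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

# Sanity check on a linear function, where the estimator is unbiased:
rng = np.random.default_rng(0)
c = np.array([1.0, -2.0, 0.5])
f = lambda x: c @ x
avg = np.mean(
    [two_point_gradient_estimate(f, np.zeros(3), 0.01, rng) for _ in range(20000)],
    axis=0,
)
print(avg)  # should be close to c
```

Averaged over many draws, the estimate recovers the gradient of a linear function; for general Lipschitz functions the key analytical quantity is the estimator's second moment, which is what the paper's modification controls.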
Cite
Text
Shamir. "An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback." Journal of Machine Learning Research, 2017.

Markdown
[Shamir. "An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback." Journal of Machine Learning Research, 2017.](https://mlanthology.org/jmlr/2017/shamir2017jmlr-optimal/)

BibTeX
@article{shamir2017jmlr-optimal,
title = {{An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback}},
author = {Shamir, Ohad},
journal = {Journal of Machine Learning Research},
year = {2017},
pages = {1--11},
volume = {18},
url = {https://mlanthology.org/jmlr/2017/shamir2017jmlr-optimal/}
}