Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff

Abstract

Bandit convex optimization is one of the fundamental problems in the field of online learning. The best known algorithm for the general bandit convex optimization problem guarantees a regret of $\widetilde{O}(T^{5/6})$, while the best known lower bound is $\Omega(T^{1/2})$. Many attempts have been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss functions are smooth. In this case, the best known algorithm guarantees a regret of $\widetilde{O}(T^{2/3})$. We present an efficient algorithm for the bandit smooth convex optimization problem that guarantees a regret of $\widetilde{O}(T^{5/8})$. Our result rules out an $\Omega(T^{2/3})$ lower bound and takes a significant step towards the resolution of this open problem.
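
For context, here is a minimal sketch of how bandit feedback is typically handled in this setting, using the classical one-point gradient estimator of Flaxman, Kalai, and McMahan (2005) rather than the algorithm of this paper, which is not reproduced here. The learner observes only the scalar loss at the queried point; perturbing the query along a random unit direction and rescaling the observed value yields an unbiased estimate of the gradient of a smoothed surrogate, and the exploration radius controls the bias-variance tradeoff referenced in the title. The quadratic loss, step size `eta`, and radius `delta` below are illustrative assumptions, not values from the paper.

```python
# Sketch of bandit convex optimization with the classical one-point
# gradient estimator (Flaxman-Kalai-McMahan, 2005). Illustrative only;
# this is NOT the algorithm of Dekel, Eldan, and Koren (2015).
import numpy as np

rng = np.random.default_rng(0)
d, T = 5, 10_000
eta, delta = 0.01, 0.1           # illustrative step size / exploration radius
R = 10.0                         # radius of the feasible ball (assumed)

x_opt = np.ones(d)               # hypothetical minimizer for the demo loss
def loss(x):
    # A smooth convex loss, chosen for illustration.
    return 0.5 * np.sum((x - x_opt) ** 2)

x = np.zeros(d)                  # current iterate
total = 0.0
for t in range(T):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)       # uniform random direction on the unit sphere
    y = x + delta * u            # play a perturbed point
    val = loss(y)                # bandit feedback: only this scalar is observed
    total += val
    g = (d / delta) * val * u    # one-point estimate of the smoothed gradient
    x = x - eta * g              # gradient step on the estimate
    norm = np.linalg.norm(x)
    if norm > R:                 # project back onto the feasible ball
        x *= R / norm

print(f"average loss: {total / T:.4f}, "
      f"final distance to optimum: {np.linalg.norm(x - x_opt):.4f}")
```

Shrinking `delta` reduces the bias of the smoothed surrogate but inflates the variance of the estimator (the `d / delta` factor), which is precisely the tradeoff the paper's analysis improves in the smooth case.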

Cite

Text

Dekel et al. "Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff." Neural Information Processing Systems, 2015.

Markdown

[Dekel et al. "Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff." Neural Information Processing Systems, 2015.](https://mlanthology.org/neurips/2015/dekel2015neurips-bandit/)

BibTeX

@inproceedings{dekel2015neurips-bandit,
  title     = {{Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff}},
  author    = {Dekel, Ofer and Eldan, Ronen and Koren, Tomer},
  booktitle = {Neural Information Processing Systems},
  year      = {2015},
  pages     = {2926--2934},
  url       = {https://mlanthology.org/neurips/2015/dekel2015neurips-bandit/}
}