Optimistic Bandit Convex Optimization

Abstract

We introduce the general and powerful scheme of predicting information re-use in optimization algorithms. This allows us to devise a computationally efficient algorithm for bandit convex optimization with new state-of-the-art guarantees for both Lipschitz loss functions and loss functions with Lipschitz gradients. This is the first algorithm that admits both polynomial time complexity and a regret polynomial in the dimension of the action space while improving upon the original regret bound for Lipschitz loss functions, achieving a regret of $\widetilde O(T^{11/16} d^{3/8})$. Our algorithm further improves upon the best existing polynomial-in-dimension bound (both computationally and in terms of regret) for loss functions with Lipschitz gradients, achieving a regret of $\widetilde O(T^{8/13} d^{5/3})$.
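
The two ingredients behind the abstract's claim, bandit (single-evaluation) gradient estimation and an optimistic update that re-uses past gradient information as a prediction, can be illustrated with a small sketch. The following is not the paper's algorithm; it is a minimal illustration combining the classical one-point spherical gradient estimate of Flaxman et al. (2005) with an optimistic gradient-descent step that uses the previous estimate as its prediction. All names (`optimistic_bandit_gd`, `project_l2_ball`) and the step sizes `eta`, `delta` are hypothetical choices for this sketch.

```python
import numpy as np

def project_l2_ball(x, radius=1.0):
    """Euclidean projection onto the l2 ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def optimistic_bandit_gd(loss, T, d, eta=0.05, delta=0.1, seed=0):
    """Illustrative sketch (not the paper's algorithm): optimistic
    gradient descent driven by one-point bandit gradient estimates.

    Each round plays a perturbed point, forms the spherical-smoothing
    gradient estimate from a single loss evaluation, and takes an
    optimistic step that re-uses the previous estimate as a prediction
    of the next gradient.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(d)          # learner's base iterate
    g_prev = np.zeros(d)     # previous gradient estimate, used as the prediction
    total_loss = 0.0
    for _ in range(T):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)                    # uniform direction on the sphere
        # Optimistic half-step: hedge against the predicted gradient.
        y = project_l2_ball(x - eta * g_prev, 1.0 - delta)
        value = loss(y + delta * u)               # single bandit evaluation
        total_loss += value
        g = (d / delta) * value * u               # one-point gradient estimate
        x = project_l2_ball(x - eta * g, 1.0 - delta)
        g_prev = g
    return total_loss / T

# Usage: average bandit loss on a simple quadratic over the unit ball.
if __name__ == "__main__":
    avg = optimistic_bandit_gd(lambda z: np.sum((z - 0.3) ** 2), T=5000, d=5)
    print(f"average loss: {avg:.4f}")
```

When the prediction `g_prev` is close to the next estimate, the optimistic half-step lets the played point anticipate the upcoming gradient, which is the intuition behind re-using past information in optimistic methods; the paper's contribution is making this idea work with bandit feedback at the stated regret rates.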

Cite

Text

Yang and Mohri. "Optimistic Bandit Convex Optimization." Neural Information Processing Systems, 2016.

Markdown

[Yang and Mohri. "Optimistic Bandit Convex Optimization." Neural Information Processing Systems, 2016.](https://mlanthology.org/neurips/2016/yang2016neurips-optimistic/)

BibTeX

@inproceedings{yang2016neurips-optimistic,
  title     = {{Optimistic Bandit Convex Optimization}},
  author    = {Yang, Scott and Mohri, Mehryar},
  booktitle = {Neural Information Processing Systems},
  year      = {2016},
  pages     = {2297--2305},
  url       = {https://mlanthology.org/neurips/2016/yang2016neurips-optimistic/}
}