Improved Regret for Zeroth-Order Stochastic Convex Bandits

Lattimore, Tor; Gyorgy, Andras

Improved Regret for Zeroth-Order Stochastic Convex Bandits

COLT 2021 pp. 2938-2964

/colt/2021/lattimore2021colt-improved/

Abstract

We present an efficient algorithm for stochastic bandit convex optimisation with no assumptions on smoothness or strong convexity and for which the regret is bounded by O(d^(4.5) sqrt(n) polylog(n)), where n is the number of interactions and d is the dimension.

PDF COLT Semantic Scholar

Cite

Text

Lattimore and Gyorgy. "Improved Regret for Zeroth-Order Stochastic Convex Bandits." Conference on Learning Theory, 2021.

Markdown

[Lattimore and Gyorgy. "Improved Regret for Zeroth-Order Stochastic Convex Bandits." Conference on Learning Theory, 2021.](https://mlanthology.org/colt/2021/lattimore2021colt-improved/)

BibTeX

@inproceedings{lattimore2021colt-improved,
  title     = {{Improved Regret for Zeroth-Order Stochastic Convex Bandits}},
  author    = {Lattimore, Tor and Gyorgy, Andras},
  booktitle = {Conference on Learning Theory},
  year      = {2021},
  pages     = {2938-2964},
  volume    = {134},
  url       = {https://mlanthology.org/colt/2021/lattimore2021colt-improved/}
}