A Second-Order Method for Stochastic Bandit Convex Optimisation

Abstract

We introduce a simple and efficient algorithm for unconstrained zeroth-order stochastic convex bandits and prove its regret is at most (1 + r/d)[d^1.5 sqrt(n) + d^3] polylog(n, d, r) where n is the horizon, d the dimension and r is the radius of a known ball containing the minimiser of the loss.

Cite

Text

Lattimore and György. "A Second-Order Method for Stochastic Bandit Convex Optimisation." Conference on Learning Theory, 2023.

Markdown

[Lattimore and György. "A Second-Order Method for Stochastic Bandit Convex Optimisation." Conference on Learning Theory, 2023.](https://mlanthology.org/colt/2023/lattimore2023colt-secondorder/)

BibTeX

@inproceedings{lattimore2023colt-secondorder,
  title     = {{A Second-Order Method for Stochastic Bandit Convex Optimisation}},
  author    = {Lattimore, Tor and György, András},
  booktitle = {Conference on Learning Theory},
  year      = {2023},
  pages     = {2067-2094},
  volume    = {195},
  url       = {https://mlanthology.org/colt/2023/lattimore2023colt-secondorder/}
}