A Second-Order Method for Stochastic Bandit Convex Optimisation
Abstract
We introduce a simple and efficient algorithm for unconstrained zeroth-order stochastic convex bandits and prove its regret is at most (1 + r/d)[d^1.5 sqrt(n) + d^3] polylog(n, d, r) where n is the horizon, d the dimension and r is the radius of a known ball containing the minimiser of the loss.
Cite
Text
Lattimore and György. "A Second-Order Method for Stochastic Bandit Convex Optimisation." Conference on Learning Theory, 2023.Markdown
[Lattimore and György. "A Second-Order Method for Stochastic Bandit Convex Optimisation." Conference on Learning Theory, 2023.](https://mlanthology.org/colt/2023/lattimore2023colt-secondorder/)BibTeX
@inproceedings{lattimore2023colt-secondorder,
title = {{A Second-Order Method for Stochastic Bandit Convex Optimisation}},
author = {Lattimore, Tor and György, András},
booktitle = {Conference on Learning Theory},
year = {2023},
pages = {2067-2094},
volume = {195},
url = {https://mlanthology.org/colt/2023/lattimore2023colt-secondorder/}
}