Online Newton Method for Bandit Convex Optimisation Extended Abstract

Abstract

We introduce a computationally efficient algorithm for zeroth-order bandit convex optimisation and prove that in the adversarial setting its regret is at most $d^{3.5} \sqrt{n}\, \mathrm{polylog}(n, d)$ with high probability, where $d$ is the dimension and $n$ is the time horizon. In the stochastic setting the bound improves to $M d^{2} \sqrt{n}\, \mathrm{polylog}(n, d)$, where $M \in [d^{-1/2}, d^{-1/4}]$ is a constant that depends on the geometry of the constraint set and the desired computational properties.

Cite

Text

Fokkema et al. "Online Newton Method for Bandit Convex Optimisation Extended Abstract." Conference on Learning Theory, 2024.

Markdown

[Fokkema et al. "Online Newton Method for Bandit Convex Optimisation Extended Abstract." Conference on Learning Theory, 2024.](https://mlanthology.org/colt/2024/fokkema2024colt-online/)

BibTeX

@inproceedings{fokkema2024colt-online,
  title     = {{Online Newton Method for Bandit Convex Optimisation Extended Abstract}},
  author    = {Fokkema, Hidde and van der Hoeven, Dirk and Lattimore, Tor and Mayo, Jack J.},
  booktitle = {Conference on Learning Theory},
  year      = {2024},
  pages     = {1713--1714},
  volume    = {247},
  url       = {https://mlanthology.org/colt/2024/fokkema2024colt-online/}
}