Comparator-Adaptive Convex Bandits

Abstract

We study bandit convex optimization methods that adapt to the norm of the comparator, a topic that has only been studied before for its full-information counterpart. Specifically, we develop convex bandit algorithms with regret bounds that are small whenever the norm of the comparator is small. We first use techniques from the full-information setting to develop comparator-adaptive algorithms for linear bandits. Then, we extend the ideas to convex bandits with Lipschitz or smooth loss functions, using a new single-point gradient estimator and carefully designed surrogate losses.

Cite

Text

van der Hoeven et al. "Comparator-Adaptive Convex Bandits." Neural Information Processing Systems, 2020.

Markdown

[van der Hoeven et al. "Comparator-Adaptive Convex Bandits." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/vanderhoeven2020neurips-comparatoradaptive/)

BibTeX

@inproceedings{vanderhoeven2020neurips-comparatoradaptive,
  title     = {{Comparator-Adaptive Convex Bandits}},
  author    = {van der Hoeven, Dirk and Cutkosky, Ashok and Luo, Haipeng},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/vanderhoeven2020neurips-comparatoradaptive/}
}