Comparator-Adaptive Convex Bandits
Abstract
We study bandit convex optimization methods that adapt to the norm of the comparator, a topic that has only been studied before for its full-information counterpart. Specifically, we develop convex bandit algorithms with regret bounds that are small whenever the norm of the comparator is small. We first use techniques from the full-information setting to develop comparator-adaptive algorithms for linear bandits. Then, we extend the ideas to convex bandits with Lipschitz or smooth loss functions, using a new single-point gradient estimator and carefully designed surrogate losses.
Cite
Text
van der Hoeven et al. "Comparator-Adaptive Convex Bandits." Neural Information Processing Systems, 2020.Markdown
[van der Hoeven et al. "Comparator-Adaptive Convex Bandits." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/vanderhoeven2020neurips-comparatoradaptive/)BibTeX
@inproceedings{vanderhoeven2020neurips-comparatoradaptive,
title = {{Comparator-Adaptive Convex Bandits}},
author = {van der Hoeven, Dirk and Cutkosky, Ashok and Luo, Haipeng},
booktitle = {Neural Information Processing Systems},
year = {2020},
url = {https://mlanthology.org/neurips/2020/vanderhoeven2020neurips-comparatoradaptive/}
}