Adaptivity to Smoothness in X-Armed Bandits
Abstract
We study the stochastic continuum-armed bandit problem from the angle of adaptivity to \emph{unknown regularity} of the reward function $f$. We prove that there exists no strategy for the cumulative regret that adapts optimally to the \emph{smoothness} of $f$. We show however that such minimax optimal adaptive strategies exist if the learner is given \emph{extra information} about $f$. Finally, we complement our positive results with matching lower bounds.
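For readers unfamiliar with the setup, the following is a minimal sketch of the standard notion of cumulative regret in continuum-armed bandits; the notation below is not spelled out on this page and is an assumption based on the usual formulation of the problem.

```latex
% Sketch under standard assumptions (not stated on this page): at each
% round t the learner picks an arm x_t in the arm space \mathcal{X} and
% observes a noisy reward whose mean is f(x_t). The cumulative
% (pseudo-)regret after T rounds is
\[
  R_T \;=\; \sum_{t=1}^{T} \Bigl( \sup_{x \in \mathcal{X}} f(x) \;-\; f(x_t) \Bigr).
\]
% "Adapting to smoothness" then means attaining the minimax regret rate
% associated with the (unknown) regularity of f without being told that
% regularity in advance.
```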
Cite
Text
Locatelli and Carpentier. "Adaptivity to Smoothness in X-Armed Bandits." Annual Conference on Computational Learning Theory, 2018.

Markdown

[Locatelli and Carpentier. "Adaptivity to Smoothness in X-Armed Bandits." Annual Conference on Computational Learning Theory, 2018.](https://mlanthology.org/colt/2018/locatelli2018colt-adaptivity/)

BibTeX
@inproceedings{locatelli2018colt-adaptivity,
title = {{Adaptivity to Smoothness in X-Armed Bandits}},
author = {Locatelli, Andrea and Carpentier, Alexandra},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2018},
pages = {1463-1492},
url = {https://mlanthology.org/colt/2018/locatelli2018colt-adaptivity/}
}