Polynomial Cost of Adaptation for X-Armed Bandits

Abstract

In the context of stochastic continuum-armed bandits, we present an algorithm that adapts to the unknown smoothness of the objective function. We exhibit and compute a polynomial cost of adaptation to the Hölder regularity for regret minimization. To do this, we first reconsider the recent lower bound of Locatelli and Carpentier, 2018, and define and characterize admissible rate functions. Our new algorithm matches any of these minimal rate functions. We provide a finite-time analysis and a thorough discussion about asymptotic optimality.

Cite

Text

Hadiji. "Polynomial Cost of Adaptation for X-Armed Bandits." Neural Information Processing Systems, 2019.

Markdown

[Hadiji. "Polynomial Cost of Adaptation for X-Armed Bandits." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/hadiji2019neurips-polynomial/)

BibTeX

@inproceedings{hadiji2019neurips-polynomial,
  title     = {{Polynomial Cost of Adaptation for X-Armed Bandits}},
  author    = {Hadiji, Hedi},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {1029-1038},
  url       = {https://mlanthology.org/neurips/2019/hadiji2019neurips-polynomial/}
}