On First-Order Bounds, Variance and Gap-Dependent Bounds for Adversarial Bandits
Abstract
We make three contributions to the theory of k-armed adversarial bandits. First, we prove a first-order bound for a modified variant of the INF strategy by Audibert and Bubeck [2009], without sacrificing worst case optimality or modifying the loss estimators. Second, we provide a variance analysis for algorithms based on follow the regularised leader, showing that without adaptation the variance of the regret is typically {\Omega}(n^2) where n is the horizon. Finally, we study bounds that depend on the degree of separation of the arms, generalising the results by Cowan and Katehakis [2015] from the stochastic setting to the adversarial and improving the result of Seldin and Slivkins [2014] by a factor of log(n)/log(log(n)).
Cite
Text
Pogodin and Lattimore. "On First-Order Bounds, Variance and Gap-Dependent Bounds for Adversarial Bandits." Uncertainty in Artificial Intelligence, 2019.Markdown
[Pogodin and Lattimore. "On First-Order Bounds, Variance and Gap-Dependent Bounds for Adversarial Bandits." Uncertainty in Artificial Intelligence, 2019.](https://mlanthology.org/uai/2019/pogodin2019uai-firstorder/)BibTeX
@inproceedings{pogodin2019uai-firstorder,
title = {{On First-Order Bounds, Variance and Gap-Dependent Bounds for Adversarial Bandits}},
author = {Pogodin, Roman and Lattimore, Tor},
booktitle = {Uncertainty in Artificial Intelligence},
year = {2019},
pages = {894-904},
volume = {115},
url = {https://mlanthology.org/uai/2019/pogodin2019uai-firstorder/}
}