Anytime Optimal Algorithms in Stochastic Multi-Armed Bandits
Abstract
We introduce an anytime algorithm for stochastic multi-armed bandits with optimal distribution-free and distribution-dependent bounds (for a specific family of parameters). The performance of this algorithm (as well as that of another one motivated by the conjectured optimal bound) is evaluated empirically. A similar analysis is provided for the full-information setting, to serve as a benchmark.
Cite
Text
Degenne and Perchet. "Anytime Optimal Algorithms in Stochastic Multi-Armed Bandits." International Conference on Machine Learning, 2016.
Markdown
[Degenne and Perchet. "Anytime Optimal Algorithms in Stochastic Multi-Armed Bandits." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/degenne2016icml-anytime/)
BibTeX
@inproceedings{degenne2016icml-anytime,
title = {{Anytime Optimal Algorithms in Stochastic Multi-Armed Bandits}},
author = {Degenne, Rémy and Perchet, Vianney},
booktitle = {International Conference on Machine Learning},
year = {2016},
pages = {1587--1595},
volume = {48},
url = {https://mlanthology.org/icml/2016/degenne2016icml-anytime/}
}