Anytime Optimal Algorithms in Stochastic Multi-Armed Bandits

Abstract

We introduce an anytime algorithm for stochastic multi-armed bandits with optimal distribution-free and distribution-dependent bounds (for a specific family of parameters). The performance of this algorithm (as well as that of another one motivated by the conjectured optimal bound) is evaluated empirically. A similar analysis is provided for the full-information setting, to serve as a benchmark.

Cite

Text

Degenne and Perchet. "Anytime Optimal Algorithms in Stochastic Multi-Armed Bandits." International Conference on Machine Learning, 2016.

Markdown

[Degenne and Perchet. "Anytime Optimal Algorithms in Stochastic Multi-Armed Bandits." International Conference on Machine Learning, 2016.](https://mlanthology.org/icml/2016/degenne2016icml-anytime/)

BibTeX

@inproceedings{degenne2016icml-anytime,
  title     = {{Anytime Optimal Algorithms in Stochastic Multi-Armed Bandits}},
  author    = {Degenne, Rémy and Perchet, Vianney},
  booktitle = {International Conference on Machine Learning},
  year      = {2016},
  pages     = {1587-1595},
  volume    = {48},
  url       = {https://mlanthology.org/icml/2016/degenne2016icml-anytime/}
}