Simple Bayesian Algorithms for Best Arm Identification

Abstract

This paper considers the optimal adaptive allocation of measurement effort for identifying the best among a finite set of options or designs. An experimenter sequentially chooses designs to measure and observes noisy signals of their quality with the goal of confidently identifying the best design after a small number of measurements. Just as the multiarmed bandit problem crystallizes the tradeoff between exploration and exploitation, this “pure exploration” variant crystallizes the challenge of rapidly gathering information before committing to a final decision. The paper proposes several simple Bayesian algorithms for allocating measurement effort and, by characterizing fundamental asymptotic limits on the performance of any algorithm, formalizes a sense in which these seemingly naive algorithms are the best possible.

Cite

Text

Russo. "Simple Bayesian Algorithms for Best Arm Identification." Annual Conference on Computational Learning Theory, 2016. doi:10.1287/opre.2019.1911

Markdown

[Russo. "Simple Bayesian Algorithms for Best Arm Identification." Annual Conference on Computational Learning Theory, 2016.](https://mlanthology.org/colt/2016/russo2016colt-simple/) doi:10.1287/opre.2019.1911

BibTeX

@inproceedings{russo2016colt-simple,
  title     = {{Simple Bayesian Algorithms for Best Arm Identification}},
  author    = {Russo, Daniel},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2016},
  pages     = {1417-1418},
  doi       = {10.1287/opre.2019.1911},
  url       = {https://mlanthology.org/colt/2016/russo2016colt-simple/}
}