Pure Exploration with Multiple Correct Answers

Degenne, Rémy; Koolen, Wouter M.

Pure Exploration with Multiple Correct Answers

NeurIPS 2019 pp. 14591-14600

/neurips/2019/degenne2019neurips-pure/

Abstract

We determine the sample complexity of pure exploration bandit problems with multiple good answers. We derive a lower bound using a new game equilibrium argument. We show how continuity and convexity properties of single-answer problems ensure that the existing Track-and-Stop algorithm has asymptotically optimal sample complexity. However, that convexity is lost when going to the multiple-answer setting. We present a new algorithm which extends Track-and-Stop to the multiple-answer case and has asymptotic sample complexity matching the lower bound.

PDF NeurIPS Semantic Scholar

Cite

Text

Degenne and Koolen. "Pure Exploration with Multiple Correct Answers." Neural Information Processing Systems, 2019.

Markdown

[Degenne and Koolen. "Pure Exploration with Multiple Correct Answers." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/degenne2019neurips-pure/)

BibTeX

@inproceedings{degenne2019neurips-pure,
  title     = {{Pure Exploration with Multiple Correct Answers}},
  author    = {Degenne, Rémy and Koolen, Wouter M.},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {14591-14600},
  url       = {https://mlanthology.org/neurips/2019/degenne2019neurips-pure/}
}