The Batch Complexity of Bandit Pure Exploration

ICML 2025 pp. 60442-60468

/icml/2025/tuynman2025icml-batch/

Abstract

In a fixed-confidence pure exploration problem in stochastic multi-armed bandits, an algorithm iteratively samples arms and should stop as early as possible and return the correct answer to a query about the arms distributions. We are interested in batched methods, which change their sampling behaviour only a few times, between batches of observations. We give an instance-dependent lower bound on the number of batches used by any sample efficient algorithm for any pure exploration task. We then give a general batched algorithm and prove upper bounds on its expected sample complexity and batch complexity. We illustrate both lower and upper bounds on best-arm identification and thresholding bandits.

PDF ICML OpenReview Semantic Scholar

Cite

Text

Tuynman and Degenne. "The Batch Complexity of Bandit Pure Exploration." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Tuynman and Degenne. "The Batch Complexity of Bandit Pure Exploration." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/tuynman2025icml-batch/)

BibTeX

@inproceedings{tuynman2025icml-batch,
  title     = {{The Batch Complexity of Bandit Pure Exploration}},
  author    = {Tuynman, Adrienne and Degenne, Rémy},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {60442-60468},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/tuynman2025icml-batch/}
}