Epsilon Best Arm Identification in Spectral Bandits
Abstract
We propose an analysis of Probably Approximately Correct (PAC) identification of an ε-best arm in graph bandit models with Gaussian distributions. We consider finite but potentially very large bandit models where the set of arms is endowed with a graph structure, and we assume that the arms’ expectations μ are smooth with respect to this graph. Our goal is to identify an arm whose expectation is at most ε below the largest of all means. We focus on the fixed-confidence setting: given a risk parameter δ, we consider sequential strategies that yield an ε-optimal arm with probability at least 1−δ. All such strategies use at least T ∗ R,ε(μ) log(1/δ) samples, where R is the smoothness parameter. We identify the complexity term T ∗ R,ε(μ) as the solution of a min-max problem for which we give a game-theoretic analysis and an approximation procedure. This procedure is the key element required by the asymptotically optimal Track-andStop strategy.
Cite
Text
Kocák and Garivier. "Epsilon Best Arm Identification in Spectral Bandits." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/363Markdown
[Kocák and Garivier. "Epsilon Best Arm Identification in Spectral Bandits." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/kocak2021ijcai-epsilon/) doi:10.24963/IJCAI.2021/363BibTeX
@inproceedings{kocak2021ijcai-epsilon,
title = {{Epsilon Best Arm Identification in Spectral Bandits}},
author = {Kocák, Tomás and Garivier, Aurélien},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2021},
pages = {2636-2642},
doi = {10.24963/IJCAI.2021/363},
url = {https://mlanthology.org/ijcai/2021/kocak2021ijcai-epsilon/}
}