Multi-Thresholding Good Arm Identification with Bandit Feedback

Abstract

We consider a good arm identification problem in a stochastic bandit setting with multi-objectives, where each arm $i\\in[K]$ is associated with a distribution $\\mathcal\{D_\{i\}\}$ defined over $\\mathbb\{R\}^M$. For each round $t$, the player/algorithm pulls one arm $i_t$ and receives a $M$ dimensional vector feedback sampled according to $\\mathcal\{D_\{i_t\}\}$. The target is twofold, one is finding one arm whose means are larger than the predefined thresholds $\\xi_1,\\ldots,\\xi_M$ with a confidence bound $\\delta$ and an accuracy rate $\\epsilon$ with a bounded sample complexity, the other is output $\\bot$ to indicate no such arm exists. We propose an algorithm with a sample complexity bound. Our bound is the same as the one given in the previous work when $M=1$ and $\\epsilon = 0$, and we give novel bounds for $M > 1$ and $\\epsilon > 0$. The proposed algorithm attains better numerical performance than other baselines in the experiments on synthetic and real datasets.

Cite

Text

Jiang et al. "Multi-Thresholding Good Arm Identification with Bandit Feedback." Proceedings of the 17th Asian Conference on Machine Learning, 2025.

Markdown

[Jiang et al. "Multi-Thresholding Good Arm Identification with Bandit Feedback." Proceedings of the 17th Asian Conference on Machine Learning, 2025.](https://mlanthology.org/acml/2025/jiang2025acml-multithresholding/)

BibTeX

@inproceedings{jiang2025acml-multithresholding,
  title     = {{Multi-Thresholding Good Arm Identification with Bandit Feedback}},
  author    = {Jiang, Xuanke and Hashima, Sherief and Hatano, Kohei and Takimoto, Eiji},
  booktitle = {Proceedings of the 17th Asian Conference on Machine Learning},
  year      = {2025},
  pages     = {49-64},
  volume    = {304},
  url       = {https://mlanthology.org/acml/2025/jiang2025acml-multithresholding/}
}