Quantum Best Arm Identification with Quantum Oracles

Abstract

Best arm identification (BAI) is a key problem in stochastic multi-armed bandits, where K arms each has an associated reward distribution, and the objective is to minimize the number of queries needed to identify the best arm with high confidence. In this paper, we explore BAI using quantum oracles. For the case where each query probes only one arm (m=1), we devise a quantum algorithm with a query complexity upper bound of O((K/Delta)log(1/delta)), where delta is the confidence parameter and Delta is the reward gap between best and second best arms. This improves on the classical bound by a factor of 1/Delta. For the general case where a single query can probe m arms (1

Cite

Text

Wang et al. "Quantum Best Arm Identification with Quantum Oracles." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I20.35432

Markdown

[Wang et al. "Quantum Best Arm Identification with Quantum Oracles." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/wang2025aaai-quantum/) doi:10.1609/AAAI.V39I20.35432

BibTeX

@inproceedings{wang2025aaai-quantum,
  title     = {{Quantum Best Arm Identification with Quantum Oracles}},
  author    = {Wang, Xuchuang and Chen, Yu-Zhen Janice and de Andrade, Matheus Guedes and Allcock, Jonathan and Hajiesmaili, Mohammad and Lui, John C. S. and Towsley, Don},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {21321-21329},
  doi       = {10.1609/AAAI.V39I20.35432},
  url       = {https://mlanthology.org/aaai/2025/wang2025aaai-quantum/}
}