Quantum Best Arm Identification with Quantum Oracles
Abstract
Best arm identification (BAI) is a key problem in stochastic multi-armed bandits, where K arms each has an associated reward distribution, and the objective is to minimize the number of queries needed to identify the best arm with high confidence. In this paper, we explore BAI using quantum oracles. For the case where each query probes only one arm (m=1), we devise a quantum algorithm with a query complexity upper bound of O((K/Delta)log(1/delta)), where delta is the confidence parameter and Delta is the reward gap between best and second best arms. This improves on the classical bound by a factor of 1/Delta. For the general case where a single query can probe m arms (1
Cite
Text
Wang et al. "Quantum Best Arm Identification with Quantum Oracles." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I20.35432Markdown
[Wang et al. "Quantum Best Arm Identification with Quantum Oracles." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/wang2025aaai-quantum/) doi:10.1609/AAAI.V39I20.35432BibTeX
@inproceedings{wang2025aaai-quantum,
title = {{Quantum Best Arm Identification with Quantum Oracles}},
author = {Wang, Xuchuang and Chen, Yu-Zhen Janice and de Andrade, Matheus Guedes and Allcock, Jonathan and Hajiesmaili, Mohammad and Lui, John C. S. and Towsley, Don},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {21321-21329},
doi = {10.1609/AAAI.V39I20.35432},
url = {https://mlanthology.org/aaai/2025/wang2025aaai-quantum/}
}