Multi-Player Multi-Armed Bandits with Delayed Feedback
Abstract
Multi-player multi-armed bandits (MP-MAB) have been extensively studied because of their applications in cognitive radio networks. In the standard setting, multiple players simultaneously select arms and receive feedback instantly. In realistic decentralized networks, however, feedback is often delayed by sensing latency and signal processing. Without a central coordinator, explicit communication is impossible, and delayed feedback disrupts implicit coordination, which depends on synchronous observations. As a result, collisions become frequent and system performance degrades significantly. In this paper, we propose an algorithm for MP-MAB with stochastic delayed feedback. Each player independently maintains an estimate of the optimal arm set based on their own delayed rewards but only pulls arms from the set, which is, with high probability, identical to those of other players, thus avoiding collisions. The identical arm set also enables implicit communication, allowing players to use the exploration results of others. We establish a regret upper bound and derive a lower bound showing that the algorithm is near-optimal. Numerical experiments on both synthetic and real-world datasets validate the effectiveness of our algorithm.
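To make the setting in the abstract concrete, the following is a minimal sketch of a toy environment with collisions and stochastically delayed feedback. It is not the paper's algorithm, and several details are assumptions not stated in the abstract: the class name `DelayedCollisionBandit`, Bernoulli arm rewards, the common convention that colliding players receive zero reward, and delays drawn uniformly from `{0, ..., max_delay}`.

```python
import heapq
import random


class DelayedCollisionBandit:
    """Toy MP-MAB environment (illustrative assumptions, not the paper's model)."""

    def __init__(self, means, num_players, max_delay, seed=0):
        self.means = means              # assumed Bernoulli mean of each arm
        self.num_players = num_players
        self.max_delay = max_delay      # assumed: delay uniform on {0, ..., max_delay}
        self.rng = random.Random(seed)
        self.t = 0                      # current round
        self.pending = []               # min-heap of (arrival_round, seq, player, arm, reward)
        self._seq = 0                   # tie-breaker so heap order is deterministic

    def step(self, choices):
        """Play one round; return the feedback that arrives this round, per player."""
        assert len(choices) == self.num_players
        counts = {}
        for arm in choices:
            counts[arm] = counts.get(arm, 0) + 1
        for player, arm in enumerate(choices):
            collided = counts[arm] > 1
            # Assumed collision model: players sharing an arm all get zero reward.
            reward = 0 if collided else int(self.rng.random() < self.means[arm])
            delay = self.rng.randint(0, self.max_delay)
            heapq.heappush(
                self.pending, (self.t + delay, self._seq, player, arm, reward)
            )
            self._seq += 1
        # Deliver every piece of feedback whose arrival round has been reached.
        arrived = [[] for _ in range(self.num_players)]
        while self.pending and self.pending[0][0] <= self.t:
            _, _, player, arm, reward = heapq.heappop(self.pending)
            arrived[player].append((arm, reward))
        self.t += 1
        return arrived
```

In this sketch a player calling `step` may receive no feedback, or feedback for several earlier pulls at once, so players can no longer rely on observing the same outcomes in the same round. This is the difficulty for implicit coordination that the abstract describes.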
Cite
Text
Fan et al. "Multi-Player Multi-Armed Bandits with Delayed Feedback." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/564
Markdown
[Fan et al. "Multi-Player Multi-Armed Bandits with Delayed Feedback." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/fan2025ijcai-multi/) doi:10.24963/IJCAI.2025/564
BibTeX
@inproceedings{fan2025ijcai-multi,
title = {{Multi-Player Multi-Armed Bandits with Delayed Feedback}},
author = {Fan, Jingqi and Wang, Zilong and Li, Shuai and Kong, Linghe},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {5065-5073},
doi = {10.24963/IJCAI.2025/564},
url = {https://mlanthology.org/ijcai/2025/fan2025ijcai-multi/}
}