Near-Optimal Regret Bounds for Federated Multi-Armed Bandits with Fully Distributed Communication
Abstract
In this paper, we study federated multi-armed bandit (FMAB) problems in which agents can communicate only with their neighbors. All agents aim to solve a common multi-armed bandit (MAB) problem, minimizing their individual regrets while also keeping the group regret small. In a federated bandit problem, an agent cannot estimate the global reward means of the arms from its local observations alone, so bandit learning algorithms typically adopt a consensus estimation strategy to handle this heterogeneity. However, existing algorithms over fully distributed communication graphs have only achieved suboptimal regret for this problem. To address this, we propose a fully distributed online consensus estimation algorithm (\texttt{CES}) that estimates the global mean without bias. Integrating this consensus estimator into a distributed successive elimination framework yields our federated bandit algorithm, which significantly improves both individual and group regret over previous approaches. We also provide an in-depth analysis of the lower bound for this problem.
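For readers unfamiliar with consensus estimation over a communication graph, the sketch below illustrates the general idea of gossip-style averaging, in which each agent repeatedly mixes its estimate with those of its neighbors so that all local estimates converge to the global mean. This is only an illustrative example under the assumption of a doubly stochastic weight matrix; it is not the paper's \texttt{CES} estimator or its regret-optimal update rule.

```python
# Illustrative sketch (not the paper's CES estimator): gossip-style averaging
# over a fixed communication graph, a standard way for agents to approximate
# the global mean reward of an arm using only neighbor-to-neighbor messages.
import numpy as np

def gossip_consensus(local_means, W, num_rounds=50):
    """Run synchronous gossip averaging.

    local_means : (n_agents,) array of each agent's local sample mean for one arm.
    W           : (n_agents, n_agents) doubly stochastic weight matrix whose
                  sparsity pattern matches the communication graph.
    Returns each agent's estimate of the global mean after `num_rounds` rounds.
    """
    x = np.asarray(local_means, dtype=float)
    for _ in range(num_rounds):
        x = W @ x  # each agent averages with its neighbors' current estimates
    return x

if __name__ == "__main__":
    # 4 agents on a ring graph; Metropolis-style weights (doubly stochastic).
    W = np.array([
        [0.5,  0.25, 0.0,  0.25],
        [0.25, 0.5,  0.25, 0.0 ],
        [0.0,  0.25, 0.5,  0.25],
        [0.25, 0.0,  0.25, 0.5 ],
    ])
    local_means = [0.2, 0.9, 0.4, 0.7]      # heterogeneous local estimates
    print(gossip_consensus(local_means, W))  # all entries approach 0.55
```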
Cite
Text
Zhang et al. "Near-Optimal Regret Bounds for Federated Multi-Armed Bandits with Fully Distributed Communication." Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, 2025.
Markdown
[Zhang et al. "Near-Optimal Regret Bounds for Federated Multi-Armed Bandits with Fully Distributed Communication." Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, 2025.](https://mlanthology.org/uai/2025/zhang2025uai-nearoptimal/)
BibTeX
@inproceedings{zhang2025uai-nearoptimal,
title = {{Near-Optimal Regret Bounds for Federated Multi-Armed Bandits with Fully Distributed Communication}},
author = {Zhang, Haoran and Wang, Xuchuang and Chen, Hao-Xu and Qiu, Hao and Yang, Lin and Gao, Yang},
booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
year = {2025},
pages = {4959--4981},
volume = {286},
url = {https://mlanthology.org/uai/2025/zhang2025uai-nearoptimal/}
}