Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents

Abstract

In this paper, we present the Federated Upper Confidence Bound Value Iteration algorithm ($\texttt{Fed-UCBVI}$), a novel extension of the $\texttt{UCBVI}$ algorithm (Azar et al., 2017) tailored for the federated learning framework. We prove that the regret of $\texttt{Fed-UCBVI}$ scales as $\tilde O(\sqrt{H^3 |S| |A| T / M})$, with a small additional term due to heterogeneity, where $|S|$ is the number of states, $|A|$ is the number of actions, $H$ is the episode length, $M$ is the number of agents, and $T$ is the number of episodes. Notably, in the single-agent setting, this upper bound matches the minimax lower bound up to polylogarithmic factors, while in the multi-agent scenario, $\texttt{Fed-UCBVI}$ has linear speed-up. To conduct our analysis, we introduce a new measure of heterogeneity, which may hold independent theoretical interest. Furthermore, we show that, unlike existing federated reinforcement learning approaches, $\texttt{Fed-UCBVI}$’s communication complexity only marginally increases with the number of agents.

Cite

Text

Labbi et al. "Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.

Markdown

[Labbi et al. "Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/labbi2025aistats-federated/)

BibTeX

@inproceedings{labbi2025aistats-federated,
  title     = {{Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents}},
  author    = {Labbi, Safwan and Tiapkin, Daniil and Mancini, Lorenzo and Mangold, Paul and Moulines, Eric},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  year      = {2025},
  pages     = {1315-1323},
  volume    = {258},
  url       = {https://mlanthology.org/aistats/2025/labbi2025aistats-federated/}
}