Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents
Abstract
In this paper, we present the Federated Upper Confidence Bound Value Iteration algorithm ($\texttt{Fed-UCBVI}$), a novel extension of the $\texttt{UCBVI}$ algorithm (Azar et al., 2017) tailored for the federated learning framework. We prove that the regret of $\texttt{Fed-UCBVI}$ scales as $\tilde O(\sqrt{H^3 |S| |A| T / M})$, with a small additional term due to heterogeneity, where $|S|$ is the number of states, $|A|$ is the number of actions, $H$ is the episode length, $M$ is the number of agents, and $T$ is the number of episodes. Notably, in the single-agent setting, this upper bound matches the minimax lower bound up to polylogarithmic factors, while in the multi-agent scenario, $\texttt{Fed-UCBVI}$ has linear speed-up. To conduct our analysis, we introduce a new measure of heterogeneity, which may hold independent theoretical interest. Furthermore, we show that, unlike existing federated reinforcement learning approaches, $\texttt{Fed-UCBVI}$’s communication complexity only marginally increases with the number of agents.
Cite
Text
Labbi et al. "Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.Markdown
[Labbi et al. "Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/labbi2025aistats-federated/)BibTeX
@inproceedings{labbi2025aistats-federated,
title = {{Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents}},
author = {Labbi, Safwan and Tiapkin, Daniil and Mancini, Lorenzo and Mangold, Paul and Moulines, Eric},
booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
year = {2025},
pages = {1315-1323},
volume = {258},
url = {https://mlanthology.org/aistats/2025/labbi2025aistats-federated/}
}