Multi-Agent Meta-Reinforcement Learning: Sharper Convergence Rates with Task Similarity
Abstract
Multi-agent reinforcement learning (MARL) has primarily focused on solving a single task in isolation, while in practice the environment is often evolving, leaving many related tasks to be solved. In this paper, we investigate the benefits of meta-learning in solving multiple MARL tasks collectively. We establish the first line of theoretical results for meta-learning in a wide range of fundamental MARL settings, including learning Nash equilibria in two-player zero-sum Markov games and Markov potential games, as well as learning coarse correlated equilibria in general-sum Markov games. Under natural notions of task similarity, we show that meta-learning achieves provably sharper convergence to various game-theoretic solution concepts than learning each task separately. As an important intermediate step, we develop multiple MARL algorithms with initialization-dependent convergence guarantees. These algorithms integrate optimistic policy mirror descent with stage-based value updates, and their refined convergence guarantees (nearly) recover the best known results even when a good initialization is unknown. To the best of our knowledge, these results are also new and might be of independent interest. We further provide numerical simulations to corroborate our theoretical findings.
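To give a rough feel for what an initialization-dependent guarantee means, the following is a minimal sketch of optimistic multiplicative weights, one instance of optimistic mirror descent with an entropy regularizer, on a two-player zero-sum matrix game. It is not the paper's algorithm: it has no states, no stage-based value updates, and no meta-learning step. The function names `omwu` and `duality_gap`, the step size, and the matching-pennies payoff matrix are all assumptions made for illustration.

```python
import math


def omwu(A, x0, y0, eta=0.1, steps=5000):
    """Optimistic multiplicative weights on the game min_x max_y x^T A y.

    Illustrative sketch only (hypothetical helper, not from the paper).
    x0 and y0 are the initial mixed strategies of the two players.
    """
    n, m = len(A), len(A[0])
    x, y = list(x0), list(y0)

    def loss_x(y_cur):
        # Loss vector of the minimizing player: A y.
        return [sum(A[i][j] * y_cur[j] for j in range(m)) for i in range(n)]

    def gain_y(x_cur):
        # Gain vector of the maximizing player: A^T x.
        return [sum(A[i][j] * x_cur[i] for i in range(n)) for j in range(m)]

    # With no history yet, the first step reduces to a plain (non-optimistic) update.
    g_prev, h_prev = loss_x(y), gain_y(x)
    for _ in range(steps):
        g, h = loss_x(y), gain_y(x)
        # Optimistic step: use 2 * current feedback minus previous feedback.
        wx = [x[i] * math.exp(-eta * (2 * g[i] - g_prev[i])) for i in range(n)]
        wy = [y[j] * math.exp(eta * (2 * h[j] - h_prev[j])) for j in range(m)]
        sx, sy = sum(wx), sum(wy)
        x = [w / sx for w in wx]
        y = [w / sy for w in wy]
        g_prev, h_prev = g, h
    return x, y


def duality_gap(A, x, y):
    """Best-response gain of the max player minus best-response loss of the min player."""
    n, m = len(A), len(A[0])
    best_for_y = max(sum(A[i][j] * x[i] for i in range(n)) for j in range(m))
    best_for_x = min(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_for_y - best_for_x


if __name__ == "__main__":
    # Matching pennies (assumed example); its unique equilibrium is uniform play.
    A = [[1.0, -1.0], [-1.0, 1.0]]
    for name, x0, y0 in [
        ("far init", [0.9, 0.1], [0.2, 0.8]),
        ("near init", [0.55, 0.45], [0.48, 0.52]),
    ]:
        x, y = omwu(A, x0, y0, steps=500)
        print(name, duality_gap(A, x, y))
```

Running the script prints the duality gap after the same number of iterations from a far and a near initialization, which is the kind of comparison an initialization-dependent bound formalizes. In the meta-learning setting described above, the good initialization would come from previously solved similar tasks rather than being chosen by hand.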
Cite
Text
Mao et al. "Multi-Agent Meta-Reinforcement Learning: Sharper Convergence Rates with Task Similarity." Neural Information Processing Systems, 2023.
Markdown
[Mao et al. "Multi-Agent Meta-Reinforcement Learning: Sharper Convergence Rates with Task Similarity." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/mao2023neurips-multiagent/)
BibTeX
@inproceedings{mao2023neurips-multiagent,
title = {{Multi-Agent Meta-Reinforcement Learning: Sharper Convergence Rates with Task Similarity}},
author = {Mao, Weichao and Qiu, Haoran and Wang, Chen and Franke, Hubertus and Kalbarczyk, Zbigniew and Iyer, Ravishankar and Basar, Tamer},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/mao2023neurips-multiagent/}
}