Implicit Ensemble Training for Efficient and Robust Multiagent Reinforcement Learning
Abstract
An important issue in competitive multiagent scenarios is the distribution mismatch between training and testing caused by variations in other agents' policies. As a result, policies optimized during training are typically sub-optimal (possibly very poor) in testing. Ensemble training is an effective approach for learning robust policies that avoid significant performance degradation when competing against previously unseen opponents. A large ensemble can improve diversity during training, which leads to more robust learning. However, the computation and memory requirements increase linearly with the ensemble size, which does not scale because the ensemble size required for learning a robust policy can be quite large. This paper proposes a novel parameterization of a policy ensemble based on a deep latent variable model with a multi-task network architecture, which represents an ensemble of policies implicitly within a single network. Our implicit ensemble training (IET) approach strikes a better trade-off between ensemble diversity and scalability compared to standard ensemble training. We demonstrate in several competitive multiagent scenarios in the board game and robotic domains that our new approach improves robustness against unseen adversarial opponents while achieving higher sample efficiency and requiring less computation.
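The scaling contrast described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes a toy one-hidden-layer network in which a sampled latent vector `z` is simply concatenated with the observation, whereas the paper uses a multi-task network. All names and sizes (`init_mlp`, `policy`, `OBS_DIM`, `LATENT_DIM`, `K`, and so on) are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def init_mlp(in_dim, hidden_dim, out_dim, rng):
    """Random weights for a one-hidden-layer MLP without biases (toy sizes)."""
    w1 = [[rng.gauss(0.0, 0.5) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0.0, 0.5) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2


def num_params(mlp):
    """Total number of scalar weights in the network."""
    return sum(len(row) for layer in mlp for row in layer)


def policy(mlp, obs, z):
    """Action probabilities from the observation concatenated with latent z."""
    w1, w2 = mlp
    x = list(obs) + list(z)
    h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
    logits = [sum(w * v for w, v in zip(row, h)) for row in w2]
    m = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sizes chosen only for illustration.
OBS_DIM, LATENT_DIM, HIDDEN_DIM, N_ACTIONS, K = 4, 2, 8, 3, 10
rng = random.Random(0)

# Explicit ensemble: K separate networks, so parameters grow linearly in K.
explicit = [init_mlp(OBS_DIM, HIDDEN_DIM, N_ACTIONS, rng) for _ in range(K)]
explicit_params = sum(num_params(m) for m in explicit)

# Implicit ensemble (assumed form): one network; each sampled z picks a member.
implicit = init_mlp(OBS_DIM + LATENT_DIM, HIDDEN_DIM, N_ACTIONS, rng)
implicit_params = num_params(implicit)

obs = [0.1, -0.3, 0.7, 0.2]
z_a = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
z_b = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
probs_a = policy(implicit, obs, z_a)  # behavior of one implicit member
probs_b = policy(implicit, obs, z_b)  # a different member, same weights

print("explicit ensemble parameters:", explicit_params)
print("implicit ensemble parameters:", implicit_params)
```

In this toy setting the explicit ensemble's parameter count is `K` times that of a single network, while the latent-conditioned network's size is independent of how many latent samples (ensemble members) are drawn. Whether the resulting policies are diverse enough to yield robustness depends on training, which this sketch does not cover.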
Cite
Text
Shen and How. "Implicit Ensemble Training for Efficient and Robust Multiagent Reinforcement Learning." Transactions on Machine Learning Research, 2023.
Markdown
[Shen and How. "Implicit Ensemble Training for Efficient and Robust Multiagent Reinforcement Learning." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/shen2023tmlr-implicit/)
BibTeX
@article{shen2023tmlr-implicit,
title = {{Implicit Ensemble Training for Efficient and Robust Multiagent Reinforcement Learning}},
author = {Shen, Macheng and How, Jonathan P},
journal = {Transactions on Machine Learning Research},
year = {2023},
url = {https://mlanthology.org/tmlr/2023/shen2023tmlr-implicit/}
}