Multi-Agent Adversarial Inverse Reinforcement Learning

Abstract

Reinforcement learning agents are prone to undesired behaviors due to reward mis-specification. Finding a set of reward functions that properly guides agent behavior is particularly challenging in multi-agent scenarios. Inverse reinforcement learning provides a framework for automatically acquiring suitable reward functions from expert demonstrations. Its extension to multi-agent settings, however, is difficult due to the more complex notions of rational behavior. In this paper, we propose MA-AIRL, a new framework for multi-agent inverse reinforcement learning that is effective and scalable for Markov games with high-dimensional state-action spaces and unknown dynamics. We derive our algorithm from a new solution concept and maximum pseudolikelihood estimation within an adversarial reward learning framework. In experiments, we demonstrate that MA-AIRL recovers reward functions that are highly correlated with the ground-truth rewards, while significantly outperforming prior methods in policy imitation.
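
At a high level, the adversarial reward learning step trains a discriminator per agent to tell expert transitions apart from transitions generated by the current policies, and the discriminator's structured form then serves as the recovered reward. The sketch below illustrates this idea for a single agent with an AIRL-style discriminator. It is a minimal, hypothetical example assuming a PyTorch setup; the class and function names, network sizes, and training details are illustrative assumptions and not the authors' implementation.

```python
import torch
import torch.nn as nn

class AIRLDiscriminator(nn.Module):
    """AIRL-style discriminator D(s, a, s') = exp(f) / (exp(f) + pi(a|s)),
    where f(s, a, s') = g(s, a) + gamma * h(s') - h(s) splits into a reward
    term g and a state-only shaping term h. Architecture sizes are illustrative."""

    def __init__(self, obs_dim: int, act_dim: int, gamma: float = 0.99):
        super().__init__()
        self.gamma = gamma
        # g approximates the recovered reward; h is a potential-based shaping term.
        self.g = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
        self.h = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def f(self, obs, act, next_obs):
        # Shaped reward estimate for a transition (s, a, s'); shape (batch, 1).
        return self.g(torch.cat([obs, act], dim=-1)) + self.gamma * self.h(next_obs) - self.h(obs)

    def logits(self, obs, act, next_obs, log_pi):
        # log D - log(1 - D) = f(s, a, s') - log pi(a|s); log_pi has shape (batch, 1).
        return self.f(obs, act, next_obs) - log_pi


def discriminator_loss(disc, expert_batch, policy_batch):
    """Binary logistic loss: expert transitions labeled 1, policy transitions labeled 0.
    Each batch is a tuple (obs, act, next_obs, log_pi) of tensors."""
    bce = nn.BCEWithLogitsLoss()
    exp_logits = disc.logits(*expert_batch)
    pol_logits = disc.logits(*policy_batch)
    return bce(exp_logits, torch.ones_like(exp_logits)) + bce(pol_logits, torch.zeros_like(pol_logits))
```

In a multi-agent setting, one such discriminator would be maintained for each agent, each using that agent's own policy log-probability as the baseline, while the policies (the generator) are updated with a multi-agent RL algorithm using log D - log(1 - D) as the learned reward signal.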

Cite

Text

Yu et al. "Multi-Agent Adversarial Inverse Reinforcement Learning." International Conference on Machine Learning, 2019.

Markdown

[Yu et al. "Multi-Agent Adversarial Inverse Reinforcement Learning." International Conference on Machine Learning, 2019.](https://mlanthology.org/icml/2019/yu2019icml-multiagent/)

BibTeX

@inproceedings{yu2019icml-multiagent,
  title     = {{Multi-Agent Adversarial Inverse Reinforcement Learning}},
  author    = {Yu, Lantao and Song, Jiaming and Ermon, Stefano},
  booktitle = {International Conference on Machine Learning},
  year      = {2019},
  pages     = {7194--7201},
  volume    = {97},
  url       = {https://mlanthology.org/icml/2019/yu2019icml-multiagent/}
}