Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games

Abstract

Recent success in cooperative multi-agent reinforcement learning (MARL) relies on centralized training and policy sharing. Centralized training eliminates the non-stationarity issue in MARL yet incurs large communication costs, and policy sharing is empirically crucial to efficient learning in certain tasks yet lacks theoretical justification. In this paper, we formally characterize a subclass of cooperative Markov games where agents exhibit a certain level of homogeneity such that policy sharing provably incurs no suboptimality. This enables us to develop the first consensus-based decentralized actor-critic method where the consensus update is applied to both the actors and the critics while ensuring convergence. We also develop practical algorithms based on our decentralized actor-critic method to reduce the communication cost during training, while still yielding policies comparable with centralized training.
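As a rough illustration of the consensus idea described above (not the authors' implementation), the sketch below mixes each agent's actor and critic parameter vectors with its neighbors' using a doubly stochastic matrix over a ring topology. The mixing matrix W, the consensus_step helper, the parameter shapes, and the omitted local gradient steps are all illustrative assumptions.

import numpy as np

def consensus_step(params, W):
    # Each agent replaces its parameters with a weighted average of neighbors':
    # params[i] <- sum_j W[i, j] * params[j].
    # params: (n_agents, dim) array; W: (n_agents, n_agents) doubly stochastic.
    return W @ params

n_agents, dim = 4, 8
rng = np.random.default_rng(0)
actor_params = rng.normal(size=(n_agents, dim))   # hypothetical per-agent actor weights
critic_params = rng.normal(size=(n_agents, dim))  # hypothetical per-agent critic weights

# Ring communication topology: each agent averages with itself and two neighbors.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

for _ in range(10):
    # Local actor/critic gradient steps would go here (omitted);
    # agents then exchange parameters with neighbors and apply the consensus update.
    actor_params = consensus_step(actor_params, W)
    critic_params = consensus_step(critic_params, W)

# Disagreement across agents shrinks toward zero under repeated mixing.
print(np.std(actor_params, axis=0).max())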

Cite

Text

Chen et al. "Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games." NeurIPS 2021 Workshops: DeepRL, 2021.

Markdown

[Chen et al. "Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games." NeurIPS 2021 Workshops: DeepRL, 2021.](https://mlanthology.org/neuripsw/2021/chen2021neuripsw-communicationefficient/)

BibTeX

@inproceedings{chen2021neuripsw-communicationefficient,
  title     = {{Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games}},
  author    = {Chen, Dingyang and Li, Yile and Zhang, Qi},
  booktitle = {NeurIPS 2021 Workshops: DeepRL},
  year      = {2021},
  url       = {https://mlanthology.org/neuripsw/2021/chen2021neuripsw-communicationefficient/}
}