R-MADDPG for Partially Observable Environments and Limited Communication
Abstract
Several real-world tasks would benefit from multiagent reinforcement learning (MARL) algorithms, including coordination among multiple agents such as self-driving cars or autonomous delivery drones. Real-world conditions pose a challenging environment for multiagent systems because the environment is partially observable and nonstationary. Moreover, if agents must share a limited resource (e.g., communication network bandwidth), they must all learn how to coordinate its use. These aspects make learning very challenging. This paper introduces a deep recurrent multiagent actor-critic framework for multiagent coordination under partially observable settings and limited communication. We investigate the effects of recurrency on the performance and communication use of a team of agents, and demonstrate that the resulting framework is capable of learning time dependencies for not only sharing missing observations but also handling resource limitations. It gives rise to different communication patterns among agents, which nevertheless perform as well as current multiagent actor-critic methods under fully observable settings.
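To make the "deep recurrent multiagent actor-critic" idea concrete, the sketch below shows one way an agent's actor could carry an LSTM hidden state across timesteps so it can act on histories of partial observations. This is a minimal illustration assuming a PyTorch-style implementation; the class names, layer sizes, and rollout loop are illustrative and not taken from the paper.

# Minimal sketch of a recurrent actor for an R-MADDPG-style agent.
# Assumes PyTorch; all names and dimensions are illustrative.
import torch
import torch.nn as nn

class RecurrentActor(nn.Module):
    """Deterministic policy that keeps an LSTM hidden state across timesteps,
    letting the agent condition on a history of partial observations."""

    def __init__(self, obs_dim: int, act_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.lstm = nn.LSTMCell(hidden_dim, hidden_dim)
        self.head = nn.Linear(hidden_dim, act_dim)

    def forward(self, obs, hidden):
        # obs: (batch, obs_dim); hidden: tuple (h, c), each (batch, hidden_dim)
        x = torch.relu(self.encoder(obs))
        h, c = self.lstm(x, hidden)
        action = torch.tanh(self.head(h))   # bounded continuous action
        return action, (h, c)

    def init_hidden(self, batch_size: int = 1):
        h = torch.zeros(batch_size, self.lstm.hidden_size)
        return (h, h.clone())

# Example rollout for one agent under partial observability (random inputs).
actor = RecurrentActor(obs_dim=10, act_dim=2)
hidden = actor.init_hidden()
for _ in range(5):
    obs = torch.randn(1, 10)               # partial observation at this step
    action, hidden = actor.forward(obs, hidden)

In the paper's centralized-training setting, a recurrent critic would similarly maintain hidden state over the joint observations and actions; the sketch above only covers the per-agent actor.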
Cite
Text
Wang et al. "R-MADDPG for Partially Observable Environments and Limited Communication." ICML 2019 Workshops: RL4RealLife, 2019.
Markdown
[Wang et al. "R-MADDPG for Partially Observable Environments and Limited Communication." ICML 2019 Workshops: RL4RealLife, 2019.](https://mlanthology.org/icmlw/2019/wang2019icmlw-rmaddpg/)
BibTeX
@inproceedings{wang2019icmlw-rmaddpg,
  title     = {{R-MADDPG for Partially Observable Environments and Limited Communication}},
  author    = {Wang, Rose E. and Everett, Michael and How, Jonathan P.},
  booktitle = {ICML 2019 Workshops: RL4RealLife},
  year      = {2019},
  url       = {https://mlanthology.org/icmlw/2019/wang2019icmlw-rmaddpg/}
}