An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems
Abstract
The article focuses on distributed reinforcement learning in cooperative multi-agent decision processes, where an ensemble of simultaneously and independently acting agents tries to maximize a discounted sum of rewards. We assume that each agent has no information about its teammates' behaviour. Thus, in contrast to single-agent reinforcement learning, each agent has to consider its teammates' behaviour and to find a cooperative policy. We propose a model-free distributed Q-learning algorithm for cooperative multi-agent decision processes. It can be proved to find optimal policies in deterministic environments. No additional expense is needed in comparison to the non-distributed case. Further, there is no need for additional communication between the agents.
1. Introduction
Reinforcement learning has originally been discussed for Markov Decision Processes (MDPs): a single agent has to learn a policy that maximizes the discounted sum of rewards in a stochastic environment...
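The distributed Q-learning idea summarized in the abstract can be illustrated with a small sketch. The abstract itself does not spell out the update rule, so the code below rests on stated assumptions: each agent keeps a table over its own actions only, tables start at zero, rewards are non-negative, the environment is deterministic, and an entry is only ever raised (an optimistic max-update), with the agent's policy changed only when its best value for a state improves. The reward table `REWARD`, the class `OptimisticAgent`, and the function `train` are hypothetical names chosen for illustration, not taken from the paper.

```python
import random
from collections import defaultdict

# Hypothetical deterministic, single-step cooperative game for two agents:
# joint action -> shared, non-negative reward.
REWARD = {
    (0, 0): 10.0, (0, 1): 0.0, (0, 2): 0.0,
    (1, 0): 0.0,  (1, 1): 2.0, (1, 2): 0.0,
    (2, 0): 0.0,  (2, 1): 0.0, (2, 2): 5.0,
}
ACTIONS = [0, 1, 2]


class OptimisticAgent:
    """Independent learner that only sees its own action and the shared reward."""

    def __init__(self, actions, gamma=0.9):
        self.actions = actions
        self.gamma = gamma
        self.q = defaultdict(float)  # q[(state, own_action)], initialized to 0
        self.policy = {}             # state -> own action

    def act(self, state, rng, epsilon):
        # Explore with probability epsilon, or if no policy entry exists yet.
        if state not in self.policy or rng.random() < epsilon:
            return rng.choice(self.actions)
        return self.policy[state]

    def update(self, state, action, reward, next_state, done):
        old_best = max(self.q[(state, a)] for a in self.actions)
        future = 0.0 if done else max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * future
        # Assumed optimistic rule: an entry is never lowered.
        self.q[(state, action)] = max(self.q[(state, action)], target)
        # Change the policy only when the best value for this state improved.
        new_best = max(self.q[(state, a)] for a in self.actions)
        if new_best > old_best:
            self.policy[state] = action


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    agents = [OptimisticAgent(ACTIONS), OptimisticAgent(ACTIONS)]
    for _ in range(episodes):
        # Purely random exploration; no messages are exchanged between agents.
        joint = tuple(agent.act("s0", rng, epsilon=1.0) for agent in agents)
        reward = REWARD[joint]
        for agent, own_action in zip(agents, joint):
            agent.update("s0", own_action, reward, None, done=True)
    return agents


if __name__ == "__main__":
    for i, agent in enumerate(train()):
        print(i, agent.policy, dict(agent.q))
```

In this toy game each agent's table entry for an action ends up at the best reward that action can achieve with any teammate action, and both policies settle on the joint action with the highest reward, without either agent observing the other's choice.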
Cite
Text
Lauer and Riedmiller. "An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems." International Conference on Machine Learning, 2000.
Markdown
[Lauer and Riedmiller. "An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems." International Conference on Machine Learning, 2000.](https://mlanthology.org/icml/2000/lauer2000icml-algorithm/)
BibTeX
@inproceedings{lauer2000icml-algorithm,
title = {{An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems}},
author = {Lauer, Martin and Riedmiller, Martin A.},
booktitle = {International Conference on Machine Learning},
year = {2000},
pages = {535-542},
url = {https://mlanthology.org/icml/2000/lauer2000icml-algorithm/}
}