Conditional Random Fields for Multi-Agent Reinforcement Learning

Abstract

Conditional random fields (CRFs) are graphical models for modeling the probability of labels given observations. They have traditionally been trained using a set of observation and label pairs. Underlying all CRFs is the assumption that, conditioned on the training data, the labels are independent and identically distributed (iid). In this paper we explore the use of CRFs in a class of temporal learning algorithms, namely policy-gradient reinforcement learning (RL). Now the labels are no longer iid: they are actions that update the environment and affect the next observation. From an RL point of view, CRFs provide a natural way to model joint actions in a decentralized Markov decision process. They define how agents can communicate with each other to choose the optimal joint action. Our experiments include a synthetic network alignment problem, a distributed sensor network, and road traffic control; in all three, the CRF-based approach clearly outperforms RL methods which do not model the proper joint policy.
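As a rough illustration of the idea (not the paper's implementation), a CRF policy scores each joint action with observation-dependent node potentials per agent plus pairwise edge potentials coupling neighbouring agents, then normalises with a softmax. The sketch below assumes a hypothetical chain of three agents with binary actions; all names and parameters are illustrative.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 agents on a chain, each choosing a binary action.
n_agents = 3
obs = rng.normal(size=n_agents)          # one scalar observation per agent
theta_node = rng.normal(size=2)          # per-action node weights (shared)
theta_edge = rng.normal(size=(2, 2))     # pairwise coupling of neighbours

def score(joint):
    """Unnormalised log-probability of a joint action under the CRF."""
    s = sum(theta_node[a] * obs[i] for i, a in enumerate(joint))
    s += sum(theta_edge[joint[i], joint[i + 1]] for i in range(n_agents - 1))
    return s

# Enumerate all joint actions and form p(joint action | observations)
# via a softmax over the CRF scores (exact inference; feasible here
# because the joint action space is tiny).
joints = list(itertools.product([0, 1], repeat=n_agents))
logits = np.array([score(j) for j in joints])
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Sampling from this distribution yields a coordinated joint action;
# a policy-gradient method would adjust theta_node and theta_edge
# in the direction that increases expected reward.
joint_action = joints[rng.choice(len(joints), p=probs)]
```

In larger graphs the normaliser cannot be enumerated, which is where the graphical-model structure of the CRF (message passing over the agent graph) becomes essential.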

Cite

Text

Zhang et al. "Conditional Random Fields for Multi-Agent Reinforcement Learning." International Conference on Machine Learning, 2007. doi:10.1145/1273496.1273640

Markdown

[Zhang et al. "Conditional Random Fields for Multi-Agent Reinforcement Learning." International Conference on Machine Learning, 2007.](https://mlanthology.org/icml/2007/zhang2007icml-conditional/) doi:10.1145/1273496.1273640

BibTeX

@inproceedings{zhang2007icml-conditional,
  title     = {{Conditional Random Fields for Multi-Agent Reinforcement Learning}},
  author    = {Zhang, Xinhua and Aberdeen, Douglas and Vishwanathan, S. V. N.},
  booktitle = {International Conference on Machine Learning},
  year      = {2007},
  pages     = {1143--1150},
  doi       = {10.1145/1273496.1273640},
  url       = {https://mlanthology.org/icml/2007/zhang2007icml-conditional/}
}