VAST: Value Function Factorization with Variable Agent Sub-Teams

Abstract

Value function factorization (VFF) is a popular approach to cooperative multi-agent reinforcement learning that learns local value functions from global rewards. However, state-of-the-art VFF is limited to a handful of agents in most domains. We hypothesize that this is due to the flat factorization scheme, in which the VFF operator becomes a performance bottleneck as the number of agents increases. We therefore propose VFF with variable agent sub-teams (VAST). VAST approximates a factorization for sub-teams, which can be defined arbitrarily and can vary over time, e.g., to adapt to different situations. The sub-team values are then linearly decomposed across all sub-team members. Thus, VAST can learn on a more focused and compact input representation of the original VFF operator. We evaluate VAST in three multi-agent domains and show that it can significantly outperform state-of-the-art VFF when the number of agents is sufficiently large.
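The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `subteam_values` and `mix`, the fixed agent-to-sub-team assignment, and the weighted-sum stand-in for the VFF operator are all assumptions made for illustration. The sketch sums local agent values into sub-team values (a linear decomposition) and then combines the smaller set of sub-team values into one global value.

```python
import numpy as np


def subteam_values(local_q, assignment, num_subteams):
    """Sum each agent's local value into its sub-team (linear decomposition)."""
    local_q = np.asarray(local_q, dtype=float)
    assignment = np.asarray(assignment, dtype=int)
    values = np.zeros(num_subteams)
    # Unbuffered scatter-add: agents sharing a sub-team index accumulate.
    np.add.at(values, assignment, local_q)
    return values


def mix(values, weights):
    """Hypothetical stand-in for a VFF operator over sub-team values.

    Assumption: a weighted sum with non-negative weights, so the global
    value is monotonic in every sub-team value.
    """
    return float(np.dot(np.abs(weights), values))


# Six agents grouped into three sub-teams (assignment chosen arbitrarily;
# in VAST it may change over time).
local_q = [1.0, 2.0, 0.5, -1.0, 3.0, 0.5]
assignment = [0, 0, 1, 1, 2, 2]

team_q = subteam_values(local_q, assignment, num_subteams=3)
global_q = mix(team_q, weights=np.ones(3))
```

In this sketch the operator in `mix` receives three inputs instead of six, which is the sense in which the input representation becomes more compact as the number of agents grows.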

Cite

Text

Phan et al. "VAST: Value Function Factorization with Variable Agent Sub-Teams." Neural Information Processing Systems, 2021.

Markdown

[Phan et al. "VAST: Value Function Factorization with Variable Agent Sub-Teams." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/phan2021neurips-vast/)

BibTeX

@inproceedings{phan2021neurips-vast,
  title     = {{VAST: Value Function Factorization with Variable Agent Sub-Teams}},
  author    = {Phan, Thomy and Ritz, Fabian and Belzner, Lenz and Altmann, Philipp and Gabor, Thomas and Linnhoff-Popien, Claudia},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/phan2021neurips-vast/}
}