Distributionally Robust Cooperative Multi-Agent Reinforcement Learning with Value Factorization

Qu, Chengrui; Yeh, Christopher; Panaganti, Kishan; Mazumdar, Eric; Wierman, Adam

ICLR 2026

Abstract

Cooperative multi-agent reinforcement learning (MARL) commonly adopts centralized training with decentralized execution, where value-factorization methods enforce the individual-global-maximum (IGM) principle so that decentralized greedy actions recover the team-optimal joint action. However, the reliability of this recipe in real-world settings remains uncertain due to environmental uncertainties arising from the sim-to-real gap, model mismatch, and system noise. We address this gap by introducing Distributionally Robust IGM (DrIGM), a principle that requires each agent's robust greedy action to align with the robust team-optimal joint action. We show that DrIGM holds for a novel definition of robust individual action values, which is compatible with decentralized greedy execution and yields a provable robustness guarantee for the whole system. Building on this foundation, we derive DrIGM-compliant robust variants of existing value-factorization architectures (e.g., VDN/QMIX/QTRAN) that (i) train on robust Q-targets, (ii) preserve scalability, and (iii) integrate seamlessly with existing codebases without bespoke per-agent reward shaping. Empirically, on high-fidelity SustainGym simulators and a StarCraft game environment, our methods consistently improve out-of-distribution performance.
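
As a rough illustration of the standard (non-robust) IGM principle that the abstract builds on, the sketch below checks that per-agent greedy actions recover the joint greedy action under a VDN-style additive factorization. It does not implement DrIGM or the paper's robust action values, which the abstract does not specify; the value table `q_agents` and the helper names are hypothetical and chosen only for this toy example.

```python
import itertools

# Hypothetical per-agent action values: 2 agents, 3 actions each.
# These numbers are made up for illustration and are not from the paper.
q_agents = [
    [1.0, 3.0, 2.0],  # agent 0
    [0.5, 0.2, 4.0],  # agent 1
]


def q_total(joint_action):
    """VDN-style additive factorization: Q_tot is the sum of per-agent values."""
    return sum(q[a] for q, a in zip(q_agents, joint_action))


def decentralized_greedy():
    """Each agent picks the argmax of its own values, ignoring the others."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in q_agents)


def centralized_greedy():
    """Brute-force argmax of Q_tot over every joint action."""
    joint_actions = itertools.product(*(range(len(q)) for q in q_agents))
    return max(joint_actions, key=q_total)


greedy_joint = decentralized_greedy()
best_joint = centralized_greedy()

# IGM holds here: the decentralized choice matches the team-optimal joint action.
print(greedy_joint, best_joint, q_total(best_joint))
```

With an additive factorization the two selections agree by construction, since maximizing a sum of independent terms reduces to maximizing each term. The brute-force search is included only to make the comparison explicit, and it scales exponentially in the number of agents, which is the cost decentralized execution avoids.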

Cite

Text

Qu et al. "Distributionally Robust Cooperative Multi-Agent Reinforcement Learning with Value Factorization." International Conference on Learning Representations, 2026.

Markdown

[Qu et al. "Distributionally Robust Cooperative Multi-Agent Reinforcement Learning with Value Factorization." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/qu2026iclr-distributionally/)

BibTeX

@inproceedings{qu2026iclr-distributionally,
  title     = {{Distributionally Robust Cooperative Multi-Agent Reinforcement Learning with Value Factorization}},
  author    = {Qu, Chengrui and Yeh, Christopher and Panaganti, Kishan and Mazumdar, Eric and Wierman, Adam},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/qu2026iclr-distributionally/}
}