ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-Agent Zero-Shot Coordination

Abstract

Zero-shot coordination (ZSC) is a new cooperative multi-agent reinforcement learning (MARL) challenge that aims to train an ego agent to work with diverse, unseen partners during deployment. The significant difference between the deployment-time partners' distribution and the training partners' distribution determined by the training algorithm makes ZSC a unique out-of-distribution (OOD) generalization challenge. The potential distribution gap between evaluation and deployment-time partners leads to inadequate evaluation, which is exacerbated by the lack of appropriate evaluation metrics. In this paper, we present ZSC-Eval, the first evaluation toolkit and benchmark for ZSC algorithms. ZSC-Eval consists of: 1) generation of evaluation partner candidates through behavior-preferring rewards to approximate the deployment-time partners' distribution; 2) selection of evaluation partners by Best-Response Diversity (BR-Div); 3) measurement of generalization performance with various evaluation partners via the Best-Response Proximity (BR-Prox) metric. We use ZSC-Eval to benchmark ZSC algorithms in the Overcooked and Google Research Football environments and obtain novel empirical findings. We also conduct a human experiment with current ZSC algorithms to verify ZSC-Eval's consistency with human evaluation. ZSC-Eval is available at https://github.com/sjtu-marl/ZSC-Eval.
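
The abstract names BR-Prox without giving its formula. As a rough illustration only, the sketch below assumes a proximity-style score that compares the ego agent's return with each evaluation partner against the return of an approximate best response (BR) to that partner, then averages the per-partner ratios. The function name `br_prox_sketch`, the ratio form, and the plain mean are all assumptions made for illustration and are not taken from the ZSC-Eval toolkit.

```python
from statistics import mean


def br_prox_sketch(ego_returns, br_returns):
    """Hypothetical proximity score (an assumption, not the toolkit's API).

    ego_returns[i]: return of the ego agent paired with evaluation partner i.
    br_returns[i]:  return of an approximate best response paired with partner i.
    Returns the mean of the per-partner ratios ego / BR.
    """
    if len(ego_returns) != len(br_returns):
        raise ValueError("need one BR return per ego return")
    if not ego_returns:
        raise ValueError("need at least one evaluation partner")

    ratios = []
    for ego, br in zip(ego_returns, br_returns):
        if br <= 0:
            # The ratio is only meaningful for a positive BR return.
            raise ValueError("BR return must be positive")
        ratios.append(ego / br)
    return mean(ratios)


# Two hypothetical partners: the ego agent reaches half of the BR return
# with the first partner and matches the BR with the second.
score = br_prox_sketch([50.0, 100.0], [100.0, 100.0])
```

Under these assumptions, a score near 1 would mean the ego agent coordinates about as well as a best response to each partner, and lower values would indicate a larger generalization gap.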

Cite

Text

Wang et al. "ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-Agent Zero-Shot Coordination." Neural Information Processing Systems, 2024. doi:10.52202/079017-1501

Markdown

[Wang et al. "ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-Agent Zero-Shot Coordination." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/wang2024neurips-zsceval/) doi:10.52202/079017-1501

BibTeX

@inproceedings{wang2024neurips-zsceval,
  title     = {{ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-Agent Zero-Shot Coordination}},
  author    = {Wang, Xihuai and Zhang, Shao and Zhang, Wenhao and Dong, Wentao and Chen, Jingxiao and Wen, Ying and Zhang, Weinan},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-1501},
  url       = {https://mlanthology.org/neurips/2024/wang2024neurips-zsceval/}
}