StarCraft II Unplugged: Large Scale Offline Reinforcement Learning

Abstract

StarCraft II is one of the most challenging reinforcement learning (RL) environments: it is partially observable, stochastic, and multi-agent, and mastering it requires strategic planning over long time horizons combined with real-time low-level execution. It also has an active human competitive scene. StarCraft II is uniquely suited for advancing offline RL algorithms, both because of its challenging nature and because Blizzard has released a massive dataset of millions of StarCraft II games played by human players. This paper leverages that dataset to establish a benchmark, which we call StarCraft II Unplugged, that introduces unprecedented challenges for offline reinforcement learning. We define a dataset (a subset of Blizzard’s release), tools standardising an API for ML methods, and an evaluation protocol. We also present baseline agents, including behaviour cloning and offline variants of V-trace actor-critic and MuZero. We find that the variants of those algorithms with behaviour value estimation and single-step policy improvement work best and exceed a 90% win rate against previously published AlphaStar behaviour cloning agents.

Cite

Text

Mathieu et al. "StarCraft II Unplugged: Large Scale Offline Reinforcement Learning." NeurIPS 2021 Workshops: DeepRL, 2021.

Markdown

[Mathieu et al. "StarCraft II Unplugged: Large Scale Offline Reinforcement Learning." NeurIPS 2021 Workshops: DeepRL, 2021.](https://mlanthology.org/neuripsw/2021/mathieu2021neuripsw-starcraft/)

BibTeX

@inproceedings{mathieu2021neuripsw-starcraft,
  title     = {{StarCraft II Unplugged: Large Scale Offline Reinforcement Learning}},
  author    = {Mathieu, Michael and Ozair, Sherjil and Srinivasan, Srivatsan and Gulcehre, Caglar and Zhang, Shangtong and Jiang, Ray and Le Paine, Tom and Zolna, Konrad and Powell, Richard and Schrittwieser, Julian and Choi, David and Georgiev, Petko and Toyama, Daniel Kenji and Huang, Aja and Ring, Roman and Babuschkin, Igor and Ewalds, Timo and Bordbar, Mahyar and Henderson, Sarah and Gómez Colmenarejo, Sergio and van den Oord, Aaron and Czarnecki, Wojciech M. and de Freitas, Nando and Vinyals, Oriol},
  booktitle = {NeurIPS 2021 Workshops: DeepRL},
  year      = {2021},
  url       = {https://mlanthology.org/neuripsw/2021/mathieu2021neuripsw-starcraft/}
}