Adversarial Policy Learning in Two-Player Competitive Games

Abstract

In a two-player deep reinforcement learning task, recent work has shown that an attacker can learn an adversarial policy that causes a target agent to perform poorly or even react in an undesired way. However, its efficacy relies heavily on the zero-sum assumption made about the two-player game. In this work, we propose a new adversarial learning algorithm. It addresses this problem by resetting the optimization goal in the learning process and designing a new surrogate optimization function. Our experiments show that our method significantly improves adversarial agents’ exploitability compared with the state-of-the-art attack. In addition, we discover that our method can augment an agent with the ability to abuse the target game’s unfairness. Finally, we show that agents adversarially re-trained against our adversarial agents obtain stronger adversary resistance.
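To make the setting concrete, below is a minimal, hypothetical sketch of training an adversarial policy against a frozen victim in a toy non-zero-sum matrix game. The payoff matrices, the REINFORCE update, and the surrogate objective (a weighted combination of the adversary's reward and the negative of the victim's reward) are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

# Toy two-player matrix game. The two payoff matrices are deliberately NOT
# exact negations of each other, so the game is non-zero-sum (hypothetical
# payoffs chosen only for illustration).
ADV_PAYOFF = np.array([[ 0.0, -1.0,  1.0],
                       [ 1.0,  0.0, -1.0],
                       [-1.0,  1.0,  0.0]])
VIC_PAYOFF = np.array([[ 0.0,  1.0, -0.5],
                       [-1.0,  0.0,  1.0],
                       [ 0.5, -1.0,  0.0]])

# Frozen victim policy (assumed fixed during adversarial training).
victim_probs = np.array([0.5, 0.25, 0.25])

rng = np.random.default_rng(0)
logits = np.zeros(3)        # adversary's policy parameters (softmax logits)
lr, episodes, beta = 0.1, 5000, 0.5

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for _ in range(episodes):
    probs = softmax(logits)
    a_adv = rng.choice(3, p=probs)          # adversary samples an action
    a_vic = rng.choice(3, p=victim_probs)   # victim samples an action

    adv_reward = ADV_PAYOFF[a_adv, a_vic]
    vic_reward = VIC_PAYOFF[a_adv, a_vic]

    # Illustrative surrogate objective: reward the adversary both for its own
    # payoff and, with weight beta, for driving the victim's payoff down,
    # instead of optimizing the raw game score alone.
    surrogate = adv_reward - beta * vic_reward

    # REINFORCE update: grad of log pi(a_adv) w.r.t. logits is one_hot - probs.
    grad_logp = -probs
    grad_logp[a_adv] += 1.0
    logits += lr * surrogate * grad_logp

print("learned adversarial policy:", softmax(logits).round(3))
```

Swapping the surrogate return for the plain game reward (beta = 0) recovers the usual zero-sum objective, which is the baseline setting the abstract contrasts against.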

Cite

Text

Guo et al. "Adversarial Policy Learning in Two-Player Competitive Games." International Conference on Machine Learning, 2021.

Markdown

[Guo et al. "Adversarial Policy Learning in Two-Player Competitive Games." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/guo2021icml-adversarial/)

BibTeX

@inproceedings{guo2021icml-adversarial,
  title     = {{Adversarial Policy Learning in Two-Player Competitive Games}},
  author    = {Guo, Wenbo and Wu, Xian and Huang, Sui and Xing, Xinyu},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {3910--3919},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/guo2021icml-adversarial/}
}