Tactics of Adversarial Attack on Deep Reinforcement Learning Agents

Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun

IJCAI 2017 pp. 3756-3762

doi:10.24963/IJCAI.2017/525

Abstract

We introduce two tactics, the strategically-timed attack and the enchanting attack, for using adversarial examples to attack agents trained by deep reinforcement learning algorithms. In the strategically-timed attack, the adversary aims at minimizing the agent's reward by attacking the agent at only a small subset of time steps in an episode. Limiting the attack activity to this subset helps prevent detection of the attack by the agent. We propose a novel method to determine when an adversarial example should be crafted and applied. In the enchanting attack, the adversary aims at luring the agent to a designated target state. This is achieved by combining a generative model and a planning algorithm: while the generative model predicts the future states, the planning algorithm generates a preferred sequence of actions for luring the agent. A sequence of adversarial examples is then crafted to lure the agent to take the preferred sequence of actions. We apply the proposed tactics to agents trained by state-of-the-art deep reinforcement learning algorithms, including DQN and A3C. In 5 Atari games, our strategically-timed attack reduces reward as much as the uniform attack (i.e., attacking at every time step) while attacking the agent 4 times less often. Our enchanting attack lures the agent toward designated target states with a success rate of more than 70%. Example videos are available at http://yclin.me/adversarial_attack_RL/.
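The abstract does not spell out how the strategically-timed attack decides when to act. As an illustration only, the sketch below assumes one simple timing criterion: attack at a time step only when the gap between the agent's most and least preferred action probabilities exceeds a threshold. The function names, the threshold value, and the softmax conversion for value-based agents are all assumptions made for this example, not details taken from the text above, and the step that actually crafts the adversarial perturbation is omitted.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Turn raw scores (e.g. Q-values) into a probability distribution.

    Assumption: a value-based agent such as DQN exposes one score per action,
    and a softmax is an acceptable way to read preferences off those scores.
    """
    highest = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - highest) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def preference_gap(action_probs: Sequence[float]) -> float:
    """Gap between the most and least preferred action probabilities."""
    return max(action_probs) - min(action_probs)


def should_attack(action_probs: Sequence[float], threshold: float) -> bool:
    """Hypothetical timing rule: attack only when the agent strongly prefers
    one action, on the assumption that a flipped action costs the most there."""
    return preference_gap(action_probs) > threshold


def select_attack_steps(
    episode_probs: Sequence[Sequence[float]], threshold: float
) -> List[int]:
    """Return the indices of the time steps at which the rule would attack."""
    return [
        t for t, probs in enumerate(episode_probs) if should_attack(probs, threshold)
    ]


if __name__ == "__main__":
    # Three made-up time steps with four actions each.
    episode = [
        [0.25, 0.25, 0.25, 0.25],  # indifferent agent: no attack
        [0.90, 0.05, 0.03, 0.02],  # strong preference: attack
        [0.40, 0.30, 0.20, 0.10],  # mild preference: no attack
    ]
    print(select_attack_steps(episode, threshold=0.5))
```

Raising the threshold in this sketch makes the rule attack at fewer time steps, which is the trade-off the abstract describes between attack frequency and reward reduction.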


Cite

Text

Lin et al. "Tactics of Adversarial Attack on Deep Reinforcement Learning Agents." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/525

Markdown

[Lin et al. "Tactics of Adversarial Attack on Deep Reinforcement Learning Agents." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/lin2017ijcai-tactics/) doi:10.24963/IJCAI.2017/525

BibTeX

@inproceedings{lin2017ijcai-tactics,
  title     = {{Tactics of Adversarial Attack on Deep Reinforcement Learning Agents}},
  author    = {Lin, Yen-Chen and Hong, Zhang-Wei and Liao, Yuan-Hong and Shih, Meng-Li and Liu, Ming-Yu and Sun, Min},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {3756-3762},
  doi       = {10.24963/IJCAI.2017/525},
  url       = {https://mlanthology.org/ijcai/2017/lin2017ijcai-tactics/}
}