Policy-Driven Attack: Learning to Query for Hard-Label Black-Box Adversarial Examples

Abstract

To craft black-box adversarial examples, adversaries need to query the victim model and take proper advantage of its feedback. Existing black-box attacks generally suffer from high query complexity, especially when only the top-1 decision (i.e., the hard-label prediction) of the victim model is available. In this paper, we propose a novel hard-label black-box attack named Policy-Driven Attack to reduce the query complexity. Our core idea is to learn promising search directions for the adversarial examples using a well-designed policy network in a novel reinforcement learning formulation, making each query more informative. Experimental results demonstrate that our method significantly reduces query complexity in comparison with existing state-of-the-art hard-label black-box attacks on various image classification benchmark datasets. Code and models for reproducing our results are available at https://github.com/ZiangYan/pda.pytorch.
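To give a feel for the hard-label setting the abstract describes, below is a minimal toy sketch of an iterative hard-label attack loop. It is not the paper's algorithm: the paper trains a policy network to propose search directions, whereas this sketch substitutes a random-direction proposer (`propose_direction`) and a toy victim classifier (`hard_label`); all function names and parameters here are illustrative assumptions.

```python
import numpy as np

def hard_label(x):
    # Toy victim model: only the top-1 (hard-label) decision is exposed.
    # Class 1 iff the mean pixel value exceeds 0.5.
    return int(x.mean() > 0.5)

def propose_direction(x_adv, rng):
    # Stand-in for the learned policy network: a random unit direction.
    # The paper instead learns promising directions via reinforcement learning.
    d = rng.standard_normal(x_adv.shape)
    return d / np.linalg.norm(d)

def toy_hard_label_attack(x, adv_label, n_queries=200, step=0.05, seed=0):
    """Shrink an adversarial perturbation using only hard-label queries:
    start from a point with the adversarial label and repeatedly accept
    candidate steps that stay adversarial while moving closer to x."""
    rng = np.random.default_rng(seed)
    x_adv = np.ones_like(x)  # initial point already classified as adv_label
    assert hard_label(x_adv) == adv_label
    for _ in range(n_queries):
        direction = propose_direction(x_adv, rng)
        toward = x - x_adv
        toward /= np.linalg.norm(toward) + 1e-12  # pull toward the clean image
        candidate = x_adv + step * (0.5 * direction + 0.5 * toward)
        # One hard-label query per candidate; keep it only if it is still
        # adversarial and strictly closer to the clean input.
        if (hard_label(candidate) == adv_label
                and np.linalg.norm(candidate - x) < np.linalg.norm(x_adv - x)):
            x_adv = candidate
    return x_adv

x = np.zeros(16)                       # clean input, classified as class 0
adv = toy_hard_label_attack(x, adv_label=1)
dist = np.linalg.norm(adv - x)         # perturbation size after the budget
```

The query budget (`n_queries`) is the quantity the paper aims to reduce: a learned policy would accept far more candidate steps than the random proposer above, shrinking the perturbation with fewer queries.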

Cite

Text

Yan et al. "Policy-Driven Attack: Learning to Query for Hard-Label Black-Box Adversarial Examples." International Conference on Learning Representations, 2021.

Markdown

[Yan et al. "Policy-Driven Attack: Learning to Query for Hard-Label Black-Box Adversarial Examples." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/yan2021iclr-policydriven/)

BibTeX

@inproceedings{yan2021iclr-policydriven,
  title     = {{Policy-Driven Attack: Learning to Query for Hard-Label Black-Box Adversarial Examples}},
  author    = {Yan, Ziang and Guo, Yiwen and Liang, Jian and Zhang, Changshui},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/yan2021iclr-policydriven/}
}