Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data

Abstract

We present a probabilistic framework for studying adversarial attacks on discrete data. Based on this framework, we derive a perturbation-based method, Greedy Attack, and a scalable learning-based method, Gumbel Attack, that illustrate various tradeoffs in the design of attacks. We demonstrate the effectiveness of these methods using both quantitative metrics and human evaluation on various state-of-the-art models for text classification, including a word-based CNN, a character-based CNN, and an LSTM. As an example of our results, we show that the accuracy of character-based convolutional networks drops to the level of random selection when only five characters are modified through Greedy Attack.
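To illustrate the flavor of a perturbation-based attack on discrete data, here is a minimal sketch of a greedy character-substitution attack. This is not the authors' exact algorithm: the `toy_score` function is a hypothetical stand-in for a model's confidence in the correct class, and the search simply picks, at each step, the single (position, character) substitution that most lowers that score, up to a fixed budget.

```python
# Hedged sketch of a greedy substitution attack on discrete input.
# `toy_score` is a hypothetical stand-in for a classifier's confidence;
# a real attack would query the model under attack instead.
import string


def toy_score(text: str) -> float:
    """Stand-in for model confidence in the correct class:
    here, the fraction of vowels in the text."""
    vowels = set("aeiou")
    return sum(c in vowels for c in text) / max(len(text), 1)


def greedy_attack(text: str, score_fn, budget: int = 5,
                  alphabet: str = string.ascii_lowercase) -> str:
    """Greedily replace up to `budget` characters, each time choosing
    the single substitution that most lowers `score_fn`."""
    current = list(text)
    for _ in range(budget):
        best_score, best_pos, best_char = score_fn("".join(current)), None, None
        for i, orig in enumerate(current):
            for c in alphabet:
                if c == orig:
                    continue
                current[i] = c  # try the substitution in place
                s = score_fn("".join(current))
                if s < best_score:
                    best_score, best_pos, best_char = s, i, c
            current[i] = orig  # restore before trying the next position
        if best_pos is None:  # no substitution lowers the score further
            break
        current[best_pos] = best_char
    return "".join(current)


original = "adversarial"
attacked = greedy_attack(original, toy_score, budget=5)
```

With a budget of five, the attacked string differs from the original in at most five positions, mirroring the five-character modification regime reported in the abstract. The exhaustive inner loop costs one model query per candidate substitution, which is the kind of overhead the scalable, learning-based Gumbel Attack is designed to avoid.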

Cite

Text

Yang et al. "Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data." Journal of Machine Learning Research, 2020.

Markdown

[Yang et al. "Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data." Journal of Machine Learning Research, 2020.](https://mlanthology.org/jmlr/2020/yang2020jmlr-greedy/)

BibTeX

@article{yang2020jmlr-greedy,
  title     = {{Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data}},
  author    = {Yang, Puyudi and Chen, Jianbo and Hsieh, Cho-Jui and Wang, Jane-Ling and Jordan, Michael I.},
  journal   = {Journal of Machine Learning Research},
  year      = {2020},
  pages     = {1--36},
  volume    = {21},
  url       = {https://mlanthology.org/jmlr/2020/yang2020jmlr-greedy/}
}