Interpretable Adversarial Perturbation in Input Embedding Space for Text

Sato, Motoki; Suzuki, Jun; Shindo, Hiroyuki; Matsumoto, Yuji

doi:10.24963/IJCAI.2018/601

IJCAI 2018 pp. 4323-4330

Abstract

Following great success in the image processing field, the idea of adversarial training has been applied to tasks in the natural language processing (NLP) field. One promising approach directly applies adversarial training developed in the image processing field to the input word embedding space instead of the discrete input space of texts. However, this approach gives up interpretability, such as the ability to generate adversarial texts, in exchange for significantly improved performance on NLP tasks. This paper restores interpretability to such methods by restricting the directions of perturbations toward the existing words in the input embedding space. As a result, we can straightforwardly reconstruct each perturbed input as an actual text by treating the perturbations as replacements of words in the sentence, while maintaining or even improving task performance.
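The core idea in the abstract, restricting a perturbation to directions that point from one word's embedding toward other existing words, can be illustrated with a small sketch. This is not the authors' implementation: the function name `direction_restricted_perturbation`, the toy vocabulary, the `epsilon` scale, and the choice to weight each direction by the loss gradient projected onto it are all assumptions made for illustration, and the gradient is supplied by hand rather than computed from a model.

```python
import math

def direction_restricted_perturbation(embeddings, word_idx, grad, epsilon=1.0):
    """Hypothetical helper: build a perturbation for one word as a weighted
    sum of unit directions pointing toward the other words' embeddings.

    Returns (perturbation, weights, best_idx), where best_idx is the word
    whose direction received the largest weight (None if all weights are 0).
    """
    w = embeddings[word_idx]
    directions = {}
    weights = {}
    for k, w_k in enumerate(embeddings):
        if k == word_idx:
            continue
        diff = [a - b for a, b in zip(w_k, w)]
        norm = math.sqrt(sum(x * x for x in diff))
        if norm == 0.0:
            continue  # identical vectors define no direction
        d_k = [x / norm for x in diff]
        directions[k] = d_k
        # Assumed weighting: directional derivative of the loss along d_k.
        weights[k] = sum(g * x for g, x in zip(grad, d_k))

    scale = math.sqrt(sum(a * a for a in weights.values()))
    if scale == 0.0:
        return [0.0] * len(w), weights, None

    # Rescale the weight vector so its L2 norm equals epsilon.
    weights = {k: epsilon * a / scale for k, a in weights.items()}

    perturbation = [0.0] * len(w)
    for k, a in weights.items():
        for i, x in enumerate(directions[k]):
            perturbation[i] += a * x

    best_idx = max(weights, key=weights.get)
    return perturbation, weights, best_idx


# Toy example with a made-up 2-D embedding table and a hand-picked gradient.
vocab = ["good", "great", "bad"]
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
grad = [1.0, 0.0]  # pretend gradient of the loss w.r.t. the "good" embedding

perturbation, weights, best_idx = direction_restricted_perturbation(
    embeddings, word_idx=0, grad=grad
)
replacement = vocab[best_idx]
```

Because every direction corresponds to a vocabulary word, the largest weight can be read as "replace this word with that one", which is the kind of interpretability the abstract refers to. In this toy setup the gradient is aligned with the direction toward "great", so that direction receives all of the weight.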

Cite

Text

Sato et al. "Interpretable Adversarial Perturbation in Input Embedding Space for Text." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/IJCAI.2018/601

Markdown

[Sato et al. "Interpretable Adversarial Perturbation in Input Embedding Space for Text." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/sato2018ijcai-interpretable/) doi:10.24963/IJCAI.2018/601

BibTeX

@inproceedings{sato2018ijcai-interpretable,
  title     = {{Interpretable Adversarial Perturbation in Input Embedding Space for Text}},
  author    = {Sato, Motoki and Suzuki, Jun and Shindo, Hiroyuki and Matsumoto, Yuji},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {4323-4330},
  doi       = {10.24963/IJCAI.2018/601},
  url       = {https://mlanthology.org/ijcai/2018/sato2018ijcai-interpretable/}
}