WaNet - Imperceptible Warping-Based Backdoor Attack

Abstract

With the thriving of deep learning and the widespread practice of using pre-trained networks, backdoor attacks have become an increasing security threat drawing many research interests in recent years. A third-party model can be poisoned in training to work well in normal conditions but behave maliciously when a trigger pattern appears. However, the existing backdoor attacks are all built on noise perturbation triggers, making them noticeable to humans. In this paper, we instead propose using warping-based triggers. The proposed backdoor outperforms the previous methods in a human inspection test by a wide margin, proving its stealthiness. To make such models undetectable by machine defenders, we propose a novel training mode, called the "noise" mode. The trained networks successfully attack and bypass the state-of-the-art defense methods on standard classification datasets, including MNIST, CIFAR-10, GTSRB, and CelebA. Behavior analyses show that our backdoors are transparent to network inspection, further proving this novel attack mechanism's efficiency.

Cite

Text

Nguyen and Tran. "WaNet - Imperceptible Warping-Based Backdoor Attack." International Conference on Learning Representations, 2021.

Markdown

[Nguyen and Tran. "WaNet - Imperceptible Warping-Based Backdoor Attack." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/nguyen2021iclr-wanet/)

BibTeX

@inproceedings{nguyen2021iclr-wanet,
  title     = {{WaNet - Imperceptible Warping-Based Backdoor Attack}},
  author    = {Nguyen, Tuan Anh and Tran, Anh Tuan},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/nguyen2021iclr-wanet/}
}