Thwarting Adversarial Examples: An $l_0$-Robust Sparse Fourier Transform

Abstract

We give a new algorithm for approximating the Discrete Fourier transform of an approximately sparse signal that is robust to worst-case $l_0$ corruptions, meaning that some coordinates of the signal can be corrupted arbitrarily. Our techniques generalize to a wide range of linear transformations used in data analysis, such as the Discrete Cosine and Sine transforms, the Hadamard transform, and their high-dimensional analogs. We use our algorithm to successfully defend against worst-case $l_0$ adversaries in the setting of image classification. We give experimental results on the Jacobian-based Saliency Map Attack (JSMA) and the Carlini-Wagner (CW) $l_0$ attack on the MNIST and Fashion-MNIST datasets, as well as the Adversarial Patch attack on the ImageNet dataset.
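To make the recovery problem concrete: the abstract does not spell out the algorithm, but a standard way to realize $l_0$-robust sparse recovery under a linear transform is alternating hard thresholding, estimating a $k$-sparse set of transform coefficients and a $t$-sparse set of corrupted coordinates in turn. The sketch below illustrates that idea with the DCT as the transform; the function names, iteration count, and the alternating scheme itself are illustrative assumptions, not the paper's exact algorithm.

```python
# A minimal sketch of l_0-robust sparse recovery via alternating hard
# thresholding. This is an illustration of the general technique, not the
# algorithm from the paper.
import numpy as np
from scipy.fft import dct, idct

def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v; zero out the rest."""
    out = np.zeros_like(v)
    if k > 0:
        idx = np.argsort(np.abs(v))[-k:]
        out[idx] = v[idx]
    return out

def robust_sparse_recovery(y, k, t, n_iters=20):
    """Alternately estimate k transform coefficients and t corruptions.

    y : observed 1-D signal, of which up to t coordinates may be
        corrupted arbitrarily (the l_0 corruption model)
    k : sparsity budget for the DCT coefficients of the clean signal
    t : budget for the number of corrupted coordinates
    """
    x_hat = np.zeros_like(y)  # transform-domain coefficient estimate
    e_hat = np.zeros_like(y)  # signal-domain corruption estimate
    for _ in range(n_iters):
        # Re-fit the top-k coefficients after removing the current
        # corruption estimate.
        x_hat = hard_threshold(dct(y - e_hat, norm='ortho'), k)
        # Attribute the t largest residual entries to the corruption.
        e_hat = hard_threshold(y - idct(x_hat, norm='ortho'), t)
    return idct(x_hat, norm='ortho')  # "cleaned" signal

# Toy usage: a DCT-sparse signal with 5 arbitrarily corrupted coordinates.
rng = np.random.default_rng(0)
coeffs = np.zeros(256)
coeffs[:10] = rng.normal(size=10)
clean = idct(coeffs, norm='ortho')
corrupted = clean.copy()
corrupted[rng.choice(256, size=5, replace=False)] += 10.0
recovered = robust_sparse_recovery(corrupted, k=10, t=5)
print(np.linalg.norm(recovered - clean))  # small if recovery succeeded
```

In the defense setting described in the abstract, a procedure of this kind would be applied to an input image before classification, so that an attacker's sparse pixel changes (e.g., JSMA or CW $l_0$ perturbations) are largely removed from the signal the classifier sees.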

Cite

Text

Bafna et al. "Thwarting Adversarial Examples: An $l_0$-Robust Sparse Fourier Transform." Neural Information Processing Systems, 2018.

Markdown

[Bafna et al. "Thwarting Adversarial Examples: An $l_0$-Robust Sparse Fourier Transform." Neural Information Processing Systems, 2018.](https://mlanthology.org/neurips/2018/bafna2018neurips-thwarting/)

BibTeX

@inproceedings{bafna2018neurips-thwarting,
  title     = {{Thwarting Adversarial Examples: An $l_0$-Robust Sparse Fourier Transform}},
  author    = {Bafna, Mitali and Murtagh, Jack and Vyas, Nikhil},
  booktitle = {Neural Information Processing Systems},
  year      = {2018},
  pages     = {10075--10085},
  url       = {https://mlanthology.org/neurips/2018/bafna2018neurips-thwarting/}
}