SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation

Abstract

A black-box spectral method is introduced for evaluating the adversarial robustness of a given machine learning (ML) model. Our approach, named SPADE, exploits a bijective distance mapping between input and output graphs constructed to approximate the manifolds underlying the input and output data. By leveraging the generalized Courant-Fischer theorem, we propose a SPADE score for evaluating the adversarial robustness of a given model, which is proven to be an upper bound of the best Lipschitz constant under the manifold setting. To reveal the most non-robust data samples, those highly vulnerable to adversarial attacks, we develop a spectral graph embedding procedure that leverages dominant generalized eigenvectors. This embedding step assigns each data point a robustness score that can be further harnessed for more effective adversarial training of ML models. Our experiments show promising empirical results for neural networks trained on the MNIST and CIFAR-10 data sets.
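The core idea can be illustrated concretely. A minimal sketch, assuming k-nearest-neighbor graphs over the input samples and the model's outputs, and taking the SPADE score to be the largest generalized eigenvalue of the pair of graph Laplacians (all function names here are illustrative, not the authors' reference implementation):

```python
import numpy as np
from scipy.spatial.distance import cdist

def knn_laplacian(X, k=5):
    """Combinatorial Laplacian of an (unweighted, symmetrized) kNN graph on rows of X."""
    n = len(X)
    D = cdist(X, X)
    W = np.zeros((n, n))
    for i in range(n):
        # k nearest neighbors, skipping the point itself (index 0 after sorting)
        nbrs = np.argsort(D[i])[1:k + 1]
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)            # symmetrize the adjacency matrix
    return np.diag(W.sum(axis=1)) - W  # L = D - W

def spade_score(X_in, Y_out, k=5):
    """Largest generalized eigenvalue of L_out u = lambda * L_in u.

    A large score indicates output distances that stretch input distances,
    i.e., a large Lipschitz-type bound on the input/output manifolds.
    """
    L_in = knn_laplacian(X_in, k)
    L_out = knn_laplacian(Y_out, k)
    # Graph Laplacians are singular, so use the pseudoinverse of L_in.
    eigvals = np.linalg.eigvals(np.linalg.pinv(L_in) @ L_out)
    return float(np.max(eigvals.real))
```

With identical input and output graphs (e.g., an identity map) the score is 1; a model that distorts neighborhood structure yields a larger score. The paper's per-sample robustness scores come from the dominant generalized eigenvectors of the same matrix pencil, which this sketch omits for brevity.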

Cite

Text

Cheng et al. "SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation." International Conference on Machine Learning, 2021.

Markdown

[Cheng et al. "SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/cheng2021icml-spade/)

BibTeX

@inproceedings{cheng2021icml-spade,
  title     = {{SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation}},
  author    = {Cheng, Wuxinlin and Deng, Chenhui and Zhao, Zhiqiang and Cai, Yaohui and Zhang, Zhiru and Feng, Zhuo},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {1814--1824},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/cheng2021icml-spade/}
}