Using Anomaly Feature Vectors for Detecting, Classifying and Warning of Outlier Adversarial Examples

Abstract

We present DeClaW, a system for detecting, classifying, and warning of adversarial inputs presented to a classification neural network. In contrast to current state-of-the-art methods, which only detect whether an input is clean or adversarial, we also aim to identify the type of adversarial attack (e.g., PGD or Carlini-Wagner) in addition to flagging clean inputs. To achieve this, we extract statistical profiles, which we term anomaly feature vectors (AFVs), from a set of latent features. Preliminary findings suggest that AFVs can help distinguish among several types of adversarial attacks (e.g., PGD versus Carlini-Wagner) with close to 93% accuracy on the CIFAR-10 dataset. These results open the door to using AFV-based methods not only for adversarial attack detection but also for classifying the attack type and, in turn, designing attack-specific mitigation strategies.
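The abstract's core idea, summarizing latent features into a statistical profile and classifying over it, can be sketched in a few lines. The following is a minimal illustration only: it assumes AFVs are per-layer statistics (moments and tail percentiles) concatenated into one vector, and the function names, statistic set, classifier choice, and toy random data are all hypothetical stand-ins, not DeClaW's actual implementation.

import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestClassifier

def layer_profile(latent):
    # Summarize one layer's latent activations with simple statistics.
    # The exact statistics DeClaW uses are not reproduced here; this
    # set (moments plus tail percentiles) is an illustrative stand-in.
    return np.array([
        latent.mean(),
        latent.std(),
        stats.skew(latent),
        stats.kurtosis(latent),
        np.percentile(latent, 5),
        np.percentile(latent, 95),
    ])

def anomaly_feature_vector(latents_per_layer):
    # Concatenate per-layer profiles into a single AFV for one input.
    return np.concatenate([layer_profile(z) for z in latents_per_layer])

# Toy usage: train an attack-type classifier over AFVs. The labels
# (0 = clean, 1 = PGD, 2 = Carlini-Wagner) and the random "latents"
# are placeholders for activations collected from a real network.
rng = np.random.default_rng(0)
afvs = np.stack([
    anomaly_feature_vector([rng.normal(size=256) for _ in range(4)])
    for _ in range(300)
])
labels = rng.integers(0, 3, size=300)
clf = RandomForestClassifier(random_state=0).fit(afvs, labels)
print(clf.predict(afvs[:5]))

A multi-class classifier over AFVs is what lets the system go beyond a binary clean/adversarial decision to naming the attack family, which is the premise behind attack-specific mitigation.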

Cite

Text

Manohar-Alers et al. "Using Anomaly Feature Vectors for Detecting, Classifying and Warning of Outlier Adversarial Examples." ICML 2021 Workshops: AML, 2021.

Markdown

[Manohar-Alers et al. "Using Anomaly Feature Vectors for Detecting, Classifying and Warning of Outlier Adversarial Examples." ICML 2021 Workshops: AML, 2021.](https://mlanthology.org/icmlw/2021/manoharalers2021icmlw-using/)

BibTeX

@inproceedings{manoharalers2021icmlw-using,
  title     = {{Using Anomaly Feature Vectors for Detecting, Classifying and Warning of Outlier Adversarial Examples}},
  author    = {Manohar-Alers, Nelson and Feng, Ryan and Singh, Sahib and Song, Jiguo and Prakash, Atul},
  booktitle = {ICML 2021 Workshops: AML},
  year      = {2021},
  url       = {https://mlanthology.org/icmlw/2021/manoharalers2021icmlw-using/}
}