Interpreting and Evaluating Neural Network Robustness
Abstract
Recently, adversarial deception has become one of the most significant threats to deep neural networks. However, compared with the extensive research into new designs of adversarial attacks and defenses, the intrinsic robustness of neural networks still lacks thorough investigation. This work aims to qualitatively interpret adversarial attack and defense mechanisms through loss visualization, and to establish a quantitative metric for evaluating a model's intrinsic robustness. The proposed robustness metric identifies the upper bound of a model's prediction divergence in a given domain, and thus indicates whether the model can maintain a stable prediction. Extensive experiments show that our metric has several advantages over conventional robustness estimation based on testing accuracy: (1) it provides a uniform evaluation for models with different structures and parameter scales; (2) it outperforms conventional accuracy-based robustness evaluation and provides a more reliable evaluation that is invariant across different test settings; (3) it can be generated quickly without considerable testing cost.
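To make the notion of an "upper bound of prediction divergence" concrete, the following is a minimal PyTorch sketch, not the authors' exact formulation: it approximates the worst-case KL divergence between a model's clean and perturbed predictions over an L-infinity ball of radius eps via projected gradient ascent. The function name, the choice of KL divergence, and all hyperparameters are illustrative assumptions.

import torch
import torch.nn.functional as F

def prediction_divergence_bound(model, x, eps=0.03, steps=20, lr=0.01):
    """Approximate max over ||d||_inf <= eps of KL(model(x) || model(x + d)).

    Hypothetical sketch; assumes `model` maps inputs to class logits.
    """
    model.eval()
    with torch.no_grad():
        p_clean = F.softmax(model(x), dim=1)  # reference prediction on clean input

    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        log_p_pert = F.log_softmax(model(x + delta), dim=1)
        # Divergence between clean and perturbed predictions (ascent objective)
        div = F.kl_div(log_p_pert, p_clean, reduction="batchmean")
        div.backward()
        with torch.no_grad():
            delta += lr * delta.grad.sign()  # signed-gradient ascent step
            delta.clamp_(-eps, eps)          # project back into the eps-ball
            delta.grad.zero_()

    with torch.no_grad():
        log_p_pert = F.log_softmax(model(x + delta), dim=1)
        return F.kl_div(log_p_pert, p_clean, reduction="batchmean").item()

A small returned value suggests the model's prediction stays stable everywhere in the ball; a large value indicates the prediction can be driven far from the clean output, which is the kind of instability such a metric is meant to expose.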
Cite

Text
Yu et al. "Interpreting and Evaluating Neural Network Robustness." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/583

Markdown
[Yu et al. "Interpreting and Evaluating Neural Network Robustness." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/yu2019ijcai-interpreting/) doi:10.24963/IJCAI.2019/583

BibTeX
@inproceedings{yu2019ijcai-interpreting,
title = {{Interpreting and Evaluating Neural Network Robustness}},
author = {Yu, Fuxun and Qin, Zhuwei and Liu, Chenchen and Zhao, Liang and Wang, Yanzhi and Chen, Xiang},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2019},
pages = {4199-4205},
doi = {10.24963/IJCAI.2019/583},
url = {https://mlanthology.org/ijcai/2019/yu2019ijcai-interpreting/}
}