Scalable Backdoor Detection in Neural Networks

Abstract

Recently, it has been shown that deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time that makes the resulting model misidentify samples contaminated with a small trigger patch. Current backdoor detection methods fail to achieve good detection performance and are computationally expensive. In this paper, we propose a novel trigger reverse-engineering-based approach whose computational complexity does not scale with the number of labels, and which is based on a measure that is both interpretable and universal across different network and patch types. In experiments, we observe that our method achieves a perfect score in separating Trojaned models from clean models, an improvement over the current state-of-the-art method.

Cite

Text

Harikumar et al. "Scalable Backdoor Detection in Neural Networks." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020. doi:10.1007/978-3-030-67661-2_18

Markdown

[Harikumar et al. "Scalable Backdoor Detection in Neural Networks." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2020.](https://mlanthology.org/ecmlpkdd/2020/harikumar2020ecmlpkdd-scalable/) doi:10.1007/978-3-030-67661-2_18

BibTeX

@inproceedings{harikumar2020ecmlpkdd-scalable,
  title     = {{Scalable Backdoor Detection in Neural Networks}},
  author    = {Harikumar, Haripriya and Le, Vuong and Rana, Santu and Bhattacharya, Sourangshu and Gupta, Sunil and Venkatesh, Svetha},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2020},
  pages     = {289--304},
  doi       = {10.1007/978-3-030-67661-2_18},
  url       = {https://mlanthology.org/ecmlpkdd/2020/harikumar2020ecmlpkdd-scalable/}
}