One-Pixel Signature: Characterizing CNN Models for Backdoor Detection

Abstract

We tackle the backdoor detection problem for convolutional neural networks (CNNs) by proposing a new representation called the one-pixel signature. Our task is to detect/classify whether a CNN model has been maliciously implanted with an unknown Trojan trigger. Each CNN model is associated with a signature that is created by generating, pixel by pixel, the adversarial value that produces the largest change in the class prediction. The one-pixel signature is agnostic to the choice of CNN architecture and to how the model was trained. It can be computed efficiently for a black-box CNN model without access to the network parameters. Our proposed one-pixel signature demonstrates a substantial improvement (around $30\%$ in absolute detection accuracy) over existing competing methods for backdoored CNN detection/classification. The one-pixel signature is a general representation that can be used to characterize CNN models beyond backdoor detection.
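The pixel-by-pixel construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it assumes an all-zero base image, a small grid of candidate pixel values, and a black-box `predict` function that maps a batch of images to class probabilities. The names `one_pixel_signature`, `toy_predict`, and `values` are hypothetical and exist only for illustration.

```python
import numpy as np

def one_pixel_signature(predict, shape, num_classes, values=None):
    """Hypothetical sketch: signature[k, i, j] holds the candidate value at
    pixel (i, j) that changes the black-box probability of class k the most."""
    if values is None:
        values = np.linspace(0.0, 1.0, 5)    # assumed candidate pixel values
    h, w = shape
    base = np.zeros((h, w))                  # assumed all-zero base image
    base_probs = predict(base[None])[0]      # probabilities for the base image
    signature = np.zeros((num_classes, h, w))
    for i in range(h):
        for j in range(w):
            # One copy of the base image per candidate value, differing in one pixel.
            batch = np.repeat(base[None], len(values), axis=0)
            batch[:, i, j] = values
            change = np.abs(predict(batch) - base_probs)   # shape (V, K)
            best = np.argmax(change, axis=0)               # best value index per class
            signature[:, i, j] = values[best]
    return signature

def toy_predict(batch):
    """Toy stand-in for a black-box model: class 1 reacts only to the first pixel."""
    flat = batch.reshape(len(batch), -1)
    logits = np.stack([np.zeros(len(flat)), 5.0 * flat[:, 0]], axis=1)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

sig = one_pixel_signature(toy_predict, (2, 2), 2)
```

Only queries to `predict` are used, which mirrors the black-box setting the abstract describes. How signatures are then used to separate clean from backdoored models is not covered by this sketch.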

Cite

Text

Huang et al. "One-Pixel Signature: Characterizing CNN Models for Backdoor Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58583-9_20

Markdown

[Huang et al. "One-Pixel Signature: Characterizing CNN Models for Backdoor Detection." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/huang2020eccv-onepixel/) doi:10.1007/978-3-030-58583-9_20

BibTeX

@inproceedings{huang2020eccv-onepixel,
  title     = {{One-Pixel Signature: Characterizing CNN Models for Backdoor Detection}},
  author    = {Huang, Shanjiaoyang and Peng, Weiqi and Jia, Zhiwei and Tu, Zhuowen},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58583-9_20},
  url       = {https://mlanthology.org/eccv/2020/huang2020eccv-onepixel/}
}