Fingerprinting Deep Neural Networks Globally via Universal Adversarial Perturbations

Abstract

In this paper, we propose a novel and practical mechanism that enables a service provider to verify whether a suspect model was stolen from the victim model via a model extraction attack. Our key insight is that the profile of a DNN model's decision boundary can be uniquely characterized by its Universal Adversarial Perturbations (UAPs). UAPs belong to a low-dimensional subspace, and the subspaces of piracy models are more consistent with the victim model's subspace than those of non-piracy models. Based on this, we propose a UAP fingerprinting method for DNN models and train an encoder via contrastive learning that takes fingerprints as input and outputs a similarity score. Extensive studies show that our framework can detect model IP breaches with confidence > 99.99% using only 20 fingerprints of the suspect model. It generalizes well across different model architectures and is robust against post-modifications of stolen models.
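The subspace-consistency intuition can be illustrated with a small numerical sketch. This is not the paper's pipeline: the abstract describes a learned contrastive encoder, whereas the sketch below swaps in a plain principal-angle comparison between subspaces, and it uses synthetic vectors in place of real UAPs. The function names (`orthonormal_basis`, `subspace_consistency`), the dimensions, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np


def orthonormal_basis(vectors, rank):
    """Orthonormal basis (d x rank) for the dominant directions of the rows of `vectors`."""
    # Rows of vt are right-singular vectors, i.e. directions in input space.
    _, _, vt = np.linalg.svd(vectors, full_matrices=False)
    return vt[:rank].T


def subspace_consistency(pert_a, pert_b, rank):
    """Mean squared cosine of the principal angles between two subspaces, in [0, 1]."""
    qa = orthonormal_basis(pert_a, rank)
    qb = orthonormal_basis(pert_b, rank)
    # Singular values of qa^T qb are the cosines of the principal angles.
    cosines = np.clip(np.linalg.svd(qa.T @ qb, compute_uv=False), 0.0, 1.0)
    return float(np.mean(cosines ** 2))


def sample_perturbations(rng, basis, count, noise):
    """Synthetic stand-in for UAPs: random mixtures of `basis` rows plus small noise."""
    coeffs = rng.standard_normal((count, basis.shape[0]))
    return coeffs @ basis + noise * rng.standard_normal((count, basis.shape[1]))


rng = np.random.default_rng(0)
dim, rank, count = 200, 5, 40  # hypothetical sizes, far smaller than real images

victim_basis = rng.standard_normal((rank, dim))
unrelated_basis = rng.standard_normal((rank, dim))

victim = sample_perturbations(rng, victim_basis, count, noise=0.05)
piracy = sample_perturbations(rng, victim_basis, count, noise=0.05)  # shares the subspace
independent = sample_perturbations(rng, unrelated_basis, count, noise=0.05)

score_piracy = subspace_consistency(victim, piracy, rank)
score_independent = subspace_consistency(victim, independent, rank)
print(f"piracy vs. victim:      {score_piracy:.3f}")
print(f"independent vs. victim: {score_independent:.3f}")
```

In this toy setting, the model that shares the victim's subspace scores close to 1, while the unrelated one scores close to 0. A learned encoder, as described in the abstract, would replace this hand-written score with one trained on fingerprints.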

Cite

Text

Peng et al. "Fingerprinting Deep Neural Networks Globally via Universal Adversarial Perturbations." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01307

Markdown

[Peng et al. "Fingerprinting Deep Neural Networks Globally via Universal Adversarial Perturbations." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/peng2022cvpr-fingerprinting/) doi:10.1109/CVPR52688.2022.01307

BibTeX

@inproceedings{peng2022cvpr-fingerprinting,
  title     = {{Fingerprinting Deep Neural Networks Globally via Universal Adversarial Perturbations}},
  author    = {Peng, Zirui and Li, Shaofeng and Chen, Guoxing and Zhang, Cheng and Zhu, Haojin and Xue, Minhui},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {13430-13439},
  doi       = {10.1109/CVPR52688.2022.01307},
  url       = {https://mlanthology.org/cvpr/2022/peng2022cvpr-fingerprinting/}
}