Purifying Quantization-Conditioned Backdoors via Layer-Wise Activation Correction with Distribution Approximation
Abstract
Model quantization is a compression technique that converts a full-precision model into a more compact low-precision version for efficient storage and deployment. Despite the great success of quantization, recent studies have revealed that model quantization can be maliciously exploited by implanting quantization-conditioned backdoors (QCBs). These special backdoors remain dormant in full-precision models but are exposed upon quantization. Unfortunately, existing defenses have limited effect on mitigating QCBs. In this paper, we conduct an in-depth analysis of QCBs. We reveal an intriguing characteristic of QCBs: the activations of backdoor-related neurons undergo a distribution drift after quantization even on benign samples, and this drift is more significant on poisoned samples. Motivated by this finding, we propose to purify the backdoor-exposed quantized model by aligning its layer-wise activations with those of its full-precision version. To further exploit the more pronounced activation drifts on poisoned samples, we design an additional module that layer-wisely approximates the poisoned activation distribution based on the batch normalization statistics of the full-precision model. Extensive experiments verify the effectiveness of our defense. Our code is publicly available.
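The abstract describes two layer-wise objectives: aligning the quantized model's activations with those of the full-precision model, and approximating the poisoned activation distribution using the full-precision model's batch normalization statistics. The sketch below is an illustrative simplification of these two ideas (not the authors' released implementation); the function names, the use of plain NumPy arrays for per-layer activations, and the simple MSE/statistics-matching formulations are all assumptions for exposition.

```python
import numpy as np

def activation_alignment_loss(quant_acts, fp_acts):
    """Hypothetical layer-wise alignment term: mean squared error between
    the quantized model's activations and the full-precision model's
    activations, averaged over layers. Each element of the input lists is
    a (batch, features) array for one layer."""
    per_layer = [float(np.mean((q - f) ** 2)) for q, f in zip(quant_acts, fp_acts)]
    return sum(per_layer) / len(per_layer)

def bn_distribution_loss(quant_acts, bn_means, bn_vars):
    """Hypothetical distribution-approximation term: penalize the gap
    between each layer's batch statistics and the full-precision model's
    stored BatchNorm running mean/variance for that layer."""
    loss = 0.0
    for acts, mu, var in zip(quant_acts, bn_means, bn_vars):
        batch_mu = acts.mean(axis=0)   # per-feature batch mean
        batch_var = acts.var(axis=0)   # per-feature batch variance
        loss += float(np.mean((batch_mu - mu) ** 2 + (batch_var - var) ** 2))
    return loss / len(bn_means)
```

In a purification loop, one would minimize a weighted sum of these two terms with respect to the quantized model's parameters; the weighting and optimization details are specific to the paper and not reproduced here.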
Cite
Text
Li et al. "Purifying Quantization-Conditioned Backdoors via Layer-Wise Activation Correction with Distribution Approximation." International Conference on Machine Learning, 2024.
Markdown
[Li et al. "Purifying Quantization-Conditioned Backdoors via Layer-Wise Activation Correction with Distribution Approximation." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/li2024icml-purifying/)
BibTeX
@inproceedings{li2024icml-purifying,
title = {{Purifying Quantization-Conditioned Backdoors via Layer-Wise Activation Correction with Distribution Approximation}},
author = {Li, Boheng and Cai, Yishuo and Cai, Jisong and Li, Yiming and Qiu, Han and Wang, Run and Zhang, Tianwei},
booktitle = {International Conference on Machine Learning},
year = {2024},
pages = {27439--27456},
volume = {235},
url = {https://mlanthology.org/icml/2024/li2024icml-purifying/}
}