Theoretically Principled Trade-Off for Stateful Defenses Against Query-Based Black-Box Attacks
Abstract
Adversarial examples threaten the integrity of machine learning systems with alarming success rates even under constrained black-box conditions. Stateful defenses have emerged as an effective countermeasure, detecting potential attacks by maintaining a buffer of recent queries and detecting new queries that are too similar. However, these defenses fundamentally pose a trade-off between attack detection and false positive rates, and this trade-off is typically optimized by hand-picking feature extractors and similarity thresholds that empirically work well. There is little current understanding as to the formal limits of this trade-off and the exact properties of the feature extractors/underlying problem domain that influence it. This work aims to address this gap by offering a theoretical characterization of the trade-off between detection and false positive rates for stateful defenses. We provide upper bounds for detection rates of a general class of feature extractors and analyze the impact of this trade-off on the convergence of black-box attacks. We then support our theoretical findings with empirical evaluations across multiple datasets and stateful defenses.
Cite
Text
Hooda et al. "Theoretically Principled Trade-Off for Stateful Defenses Against Query-Based Black-Box Attacks." ICML 2023 Workshops: AdvML-Frontiers, 2023.Markdown
[Hooda et al. "Theoretically Principled Trade-Off for Stateful Defenses Against Query-Based Black-Box Attacks." ICML 2023 Workshops: AdvML-Frontiers, 2023.](https://mlanthology.org/icmlw/2023/hooda2023icmlw-theoretically/)BibTeX
@inproceedings{hooda2023icmlw-theoretically,
title = {{Theoretically Principled Trade-Off for Stateful Defenses Against Query-Based Black-Box Attacks}},
author = {Hooda, Ashish and Mangaokar, Neal and Feng, Ryan and Fawaz, Kassem and Jha, Somesh and Prakash, Atul},
booktitle = {ICML 2023 Workshops: AdvML-Frontiers},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/hooda2023icmlw-theoretically/}
}