Efficient Certified Defenses Against Patch Attacks on Image Classifiers

Abstract

Adversarial patches pose a realistic threat model for physical-world attacks on autonomous systems via their perception component. Autonomous systems in safety-critical domains such as automated driving should thus contain a fail-safe fallback component that combines certifiable robustness against patches with efficient inference while maintaining high performance on clean inputs. We propose BagCert, a novel combination of model architecture and certification procedure that allows efficient certification. We derive a loss that enables end-to-end optimization of certified robustness against patches of different sizes and locations. On CIFAR10, BagCert certifies 10,000 examples in 43 seconds on a single GPU and obtains 86% clean and 60% certified accuracy against 5x5 patches.
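
The core certification idea can be illustrated with a short sketch. The following is a minimal illustration of worst-case patch certification, not the exact BagCert procedure: it assumes a BagNet-style network that outputs per-location class scores clipped to [0, 1] and aggregates them by summation, and certifies a prediction if, for every possible patch position, the true class still wins even when the adversary zeroes its score and maxes out every competitor inside the affected window. The helper names (`certify`, `affected_window`) and the receptive-field/stride values are hypothetical placeholders.

```python
import numpy as np

def affected_window(patch, rf, stride, h):
    """Number of feature-map cells along one axis whose receptive field
    (rf pixels, applied with the given stride) can overlap a patch-pixel-wide
    adversarial patch. This footprint formula is an assumption for the sketch."""
    return min(h, int(np.ceil((patch + rf - 1) / stride)))

def certify(local_scores, y, patch=5, rf=9, stride=8):
    """Return True if prediction y is certifiably robust against any
    patch x patch adversarial patch, given per-location class scores
    clipped to [0, 1] with shape (H, W, C)."""
    H, W, C = local_scores.shape
    totals = local_scores.sum(axis=(0, 1))      # clean aggregated per-class scores
    wh = affected_window(patch, rf, stride, H)  # affected window height (cells)
    ww = affected_window(patch, rf, stride, W)  # affected window width (cells)
    n = wh * ww                                 # number of attackable cells
    # Slide the affected window over every possible patch position.
    for i in range(H - wh + 1):
        for j in range(W - ww + 1):
            window = local_scores[i:i + wh, j:j + ww, :].sum(axis=(0, 1))
            # Worst case: the patch zeroes the true class inside the window
            # and pushes every other class there to its clipped maximum of 1.
            true_lb = totals[y] - window[y]
            for c in range(C):
                if c == y:
                    continue
                adv_ub = totals[c] - window[c] + n
                if adv_ub >= true_lb:  # tie counts as not certified
                    return False
    return True
```

For example, `certify(np.clip(scores, 0.0, 1.0), y=predicted_class, patch=5)` would check certified robustness against all 5x5 patch positions. Because the check only slides a window over a precomputed score map, its cost is a small fraction of a forward pass, which is consistent with the efficiency figures reported in the abstract.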

Cite

Text

Metzen and Yatsura. "Efficient Certified Defenses Against Patch Attacks on Image Classifiers." International Conference on Learning Representations, 2021.

Markdown

[Metzen and Yatsura. "Efficient Certified Defenses Against Patch Attacks on Image Classifiers." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/metzen2021iclr-efficient/)

BibTeX

@inproceedings{metzen2021iclr-efficient,
  title     = {{Efficient Certified Defenses Against Patch Attacks on Image Classifiers}},
  author    = {Metzen, Jan Hendrik and Yatsura, Maksym},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/metzen2021iclr-efficient/}
}