Asymmetric Certified Robustness via Feature-Convex Neural Networks

Abstract

Real-world adversarial attacks on machine learning models often feature an asymmetric structure wherein adversaries only attempt to induce false negatives (e.g., classify a spam email as not spam). We formalize the asymmetric robustness certification problem and correspondingly present the feature-convex neural network architecture, which composes an input-convex neural network (ICNN) with a Lipschitz continuous feature map in order to achieve asymmetric adversarial robustness. We consider the aforementioned binary setting with one "sensitive" class, and for this class we prove deterministic, closed-form, and easily computable certified robust radii for arbitrary $\ell_p$-norms. We theoretically justify the use of these models by characterizing their decision region geometry, extending the universal approximation theorem for ICNN regression to the classification setting, and proving a lower bound on the probability that such models perfectly fit even unstructured uniformly distributed data in sufficiently high dimensions. Experiments on Malimg malware classification and subsets of the MNIST, Fashion-MNIST, and CIFAR-10 datasets show that feature-convex classifiers attain substantial certified $\ell_1$, $\ell_2$, and $\ell_{\infty}$-radii while being far more computationally efficient than competitive baselines.
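To illustrate why composing a convex function with a Lipschitz feature map admits a closed-form certificate, the following is a minimal sketch and not the authors' implementation. It assumes a classifier $f = g \circ \varphi$ with $g$ convex and $\varphi$ having Lipschitz constant $L$ with respect to the $\ell_p$-norm, and it predicts the sensitive class when $f(x) > 0$. Convexity of $g$ and Hölder's inequality give $f(x + \delta) \ge f(x) - L \, \|\nabla g(\varphi(x))\|_{p,*} \, \|\delta\|_p$, so the prediction cannot change while $\|\delta\|_p < f(x) / (L \, \|\nabla g(\varphi(x))\|_{p,*})$. The toy log-sum-exp function `g`, the identity feature map, and all function names below are hypothetical choices made for illustration.

```python
import numpy as np

def dual_norm(v, p):
    """Norm dual to the l_p norm, i.e. l_q with 1/p + 1/q = 1."""
    if p == 1:
        return np.max(np.abs(v))
    if p == np.inf:
        return np.sum(np.abs(v))
    q = p / (p - 1.0)
    return np.sum(np.abs(v) ** q) ** (1.0 / q)

def g(z, A, b, c):
    """Toy convex function (assumption): log-sum-exp of affine maps plus an offset."""
    s = A @ z + b
    m = np.max(s)  # subtract the max for numerical stability
    return m + np.log(np.sum(np.exp(s - m))) + c

def grad_g(z, A, b):
    """Gradient of g: A^T softmax(A z + b)."""
    s = A @ z + b
    w = np.exp(s - np.max(s))
    w /= w.sum()
    return A.T @ w

def certified_radius(x, A, b, c, p, lip_phi=1.0):
    """Radius within which f(x + delta) stays positive.

    The feature map is taken to be the identity (assumption), so its
    Lipschitz constant lip_phi is 1 for every l_p norm.
    """
    fx = g(x, A, b, c)
    if fx <= 0:
        return 0.0  # only the sensitive (positive) class is certified
    return fx / (lip_phi * dual_norm(grad_g(x, A, b), p))
```

With a single affine piece, for example `A = np.array([[3.0, 4.0]])`, `b = np.array([0.0])`, `c = 0.0`, and `x = np.array([1.0, 1.0])`, the classifier is linear and the $\ell_2$ radius reduces to the distance from `x` to the decision hyperplane, $7/5 = 1.4$.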

Cite

Text

Pfrommer et al. "Asymmetric Certified Robustness via Feature-Convex Neural Networks." Neural Information Processing Systems, 2023.

Markdown

[Pfrommer et al. "Asymmetric Certified Robustness via Feature-Convex Neural Networks." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/pfrommer2023neurips-asymmetric/)

BibTeX

@inproceedings{pfrommer2023neurips-asymmetric,
  title     = {{Asymmetric Certified Robustness via Feature-Convex Neural Networks}},
  author    = {Pfrommer, Samuel and Anderson, Brendon and Piet, Julien and Sojoudi, Somayeh},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/pfrommer2023neurips-asymmetric/}
}