Adversarially Robust CLIP Models Induce Better (Robust) Perceptual Metrics

Abstract

Measuring perceptual similarity is a key tool in computer vision. In recent years, perceptual metrics based on features extracted from neural networks trained on large and diverse datasets, e.g. CLIP, have become popular. At the same time, such feature-based metrics are not adversarially robust. In this paper we show that adversarially robust CLIP models induce *better* and *adversarially robust* perceptual metrics that outperform existing metrics in a zero-shot setting, and further match the performance of state-of-the-art metrics while remaining robust after fine-tuning. Notably, these perceptual metrics enable adversarially robust NSFW content detection. Finally, the perceptual metrics induced by robust CLIP models are more interpretable: feature inversion can show which images are considered similar, while text inversion can find which images are associated with a given prompt. This also allows us to visualize the very rich visual concepts learned by a CLIP model, including memorized persons, paintings, and complex queries.
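The abstract does not spell out how a feature-based perceptual metric is formed; a common construction (used by CLIP-style metrics) scores two images by the cosine distance between their encoder embeddings. The sketch below illustrates that idea with plain NumPy; the function name and the use of precomputed feature vectors are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def perceptual_distance(feat_a: np.ndarray, feat_b: np.ndarray) -> float:
    """Cosine distance between two image feature embeddings.

    feat_a, feat_b: 1-D feature vectors, e.g. CLIP image-encoder outputs
    (hypothetical inputs; any feature extractor could produce them).
    Returns a value in [0, 2]: 0 for identical directions, 1 for
    orthogonal features, 2 for opposite directions.
    """
    a = feat_a / np.linalg.norm(feat_a)  # L2-normalize each embedding
    b = feat_b / np.linalg.norm(feat_b)
    return 1.0 - float(a @ b)            # 1 - cosine similarity
```

In practice the feature vectors would come from a (robust) CLIP image encoder applied to the two images being compared; a lower distance indicates higher perceptual similarity.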

Cite

Text

Croce et al. "Adversarially Robust CLIP Models Induce Better (Robust) Perceptual Metrics." ICML 2024 Workshops: FM-Wild, 2024.

Markdown

[Croce et al. "Adversarially Robust CLIP Models Induce Better (Robust) Perceptual Metrics." ICML 2024 Workshops: FM-Wild, 2024.](https://mlanthology.org/icmlw/2024/croce2024icmlw-adversarially/)

BibTeX

@inproceedings{croce2024icmlw-adversarially,
  title     = {{Adversarially Robust CLIP Models Induce Better (Robust) Perceptual Metrics}},
  author    = {Croce, Francesco and Schlarmann, Christian and Singh, Naman Deep and Hein, Matthias},
  booktitle = {ICML 2024 Workshops: FM-Wild},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/croce2024icmlw-adversarially/}
}