What Do We Learn from Inverting CLIP Models?

Abstract

We employ an inversion-based approach to examine CLIP models. Our examination reveals that inverting CLIP models produces images that are semantically aligned with the specified target prompts. We leverage these inverted images to gain insights into various aspects of CLIP models, such as their ability to blend concepts and the presence of gender biases. Notably, we observe instances of NSFW (Not Safe For Work) images during model inversion. This phenomenon occurs even for semantically innocuous prompts, like 'a beautiful landscape,' as well as for prompts involving the names of celebrities.
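At a high level, model inversion of this kind optimizes the pixels of an image so that the model's image embedding aligns with the embedding of a target text prompt. The minimal sketch below illustrates that optimization loop with a stand-in random linear "encoder" rather than a real CLIP model, and with plain gradient ascent on cosine similarity; the encoder, dimensions, learning rate, and step count are all illustrative assumptions, not the paper's actual setup (which inverts a trained CLIP model with additional regularization).

```python
import numpy as np

# Hypothetical stand-in for CLIP's image encoder: a fixed random linear map.
# The paper inverts a real CLIP model; this sketch only shows the idea of
# optimizing pixels to maximize similarity with a target embedding.
rng = np.random.default_rng(0)
D_PIX, D_EMB = 64, 16
W_img = rng.normal(size=(D_EMB, D_PIX))  # "image encoder" (assumption)

def encode_image(x):
    return W_img @ x

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Assumed target "text embedding" for a prompt (as if produced by a text tower).
target = rng.normal(size=D_EMB)

# Inversion loop: adjust pixels x so encode_image(x) aligns with the target.
x = rng.normal(size=D_PIX) * 0.01
lr = 0.1
for _ in range(300):
    z = encode_image(x)
    nz, nt = np.linalg.norm(z), np.linalg.norm(target)
    # Analytic gradient of cosine similarity w.r.t. z, chained through W_img.
    grad_z = target / (nz * nt) - (z @ target) * z / (nz**3 * nt)
    x += lr * (W_img.T @ grad_z)

final_sim = cosine(encode_image(x), target)
```

After optimization, `final_sim` approaches 1, i.e. the synthesized "image" matches the target embedding direction; with a real CLIP model the same loop (plus image-space regularizers) yields the semantically aligned images discussed in the abstract.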

Cite

Text

Kazemi et al. "What Do We Learn from Inverting CLIP Models?" NeurIPS 2024 Workshops: SafeGenAi, 2024.

Markdown

[Kazemi et al. "What Do We Learn from Inverting CLIP Models?" NeurIPS 2024 Workshops: SafeGenAi, 2024.](https://mlanthology.org/neuripsw/2024/kazemi2024neuripsw-we/)

BibTeX

@inproceedings{kazemi2024neuripsw-we,
  title     = {{What Do We Learn from Inverting CLIP Models?}},
  author    = {Kazemi, Hamid and Chegini, Atoosa and Geiping, Jonas and Feizi, Soheil and Goldstein, Tom},
  booktitle = {NeurIPS 2024 Workshops: SafeGenAi},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/kazemi2024neuripsw-we/}
}