MMA-Diffusion: MultiModal Attack on Diffusion Models

Abstract

In recent years, Text-to-Image (T2I) models have seen remarkable advancements and gained widespread adoption. However, this progress has inadvertently opened avenues for potential misuse, particularly in generating inappropriate or Not-Safe-For-Work (NSFW) content. Our work introduces MMA-Diffusion, a framework that presents a significant and realistic threat to the security of T2I models by effectively circumventing current defensive measures in both open-source models and commercial online services. Unlike previous approaches, MMA-Diffusion leverages both textual and visual modalities to bypass safeguards such as prompt filters and post-hoc safety checkers, thus exposing the vulnerabilities in existing defense mechanisms. Our code is available at https://github.com/cure-lab/MMA-Diffusion.

Cite

Text

Yang et al. "MMA-Diffusion: MultiModal Attack on Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00739

Markdown

[Yang et al. "MMA-Diffusion: MultiModal Attack on Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/yang2024cvpr-mmadiffusion/) doi:10.1109/CVPR52733.2024.00739

BibTeX

@inproceedings{yang2024cvpr-mmadiffusion,
  title     = {{MMA-Diffusion: MultiModal Attack on Diffusion Models}},
  author    = {Yang, Yijun and Gao, Ruiyuan and Wang, Xiaosen and Ho, Tsung-Yi and Xu, Nan and Xu, Qiang},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {7737--7746},
  doi       = {10.1109/CVPR52733.2024.00739},
  url       = {https://mlanthology.org/cvpr/2024/yang2024cvpr-mmadiffusion/}
}