Fine-Grained Prompt Screening: Defending Against Backdoor Attack on Text-to-Image Diffusion Models
Abstract
Text-to-image (T2I) diffusion models exhibit impressive generation capabilities in recent studies. However, they are vulnerable to backdoor attacks, in which malicious triggers manipulate model outputs. In this paper, we propose a novel input-level defense method called Fine-grained Prompt Screening (GrainPS). Our method is motivated by a phenomenon termed Semantics Misalignment, in which the backdoor trigger causes an inconsistency between the cross-attention projections of object words (the key words that determine the main content of the generated image) and their true semantics. In particular, we divide each prompt into pieces and conduct fine-grained analysis by examining the impact of the trigger on object words in the cross-attention layers rather than its global influence on the entire generated image. To assess the impact of each word on object words, we formulate a "semantics alignment score" as the metric, together with a carefully crafted detection strategy to identify the trigger. As a result, our implementation can detect backdoor input prompts and localize triggers simultaneously. Evaluations across four advanced backdoor attack scenarios demonstrate the effectiveness of our proposed defense method.
Cite
Text
Xu et al. "Fine-Grained Prompt Screening: Defending Against Backdoor Attack on Text-to-Image Diffusion Models." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/68
Markdown
[Xu et al. "Fine-Grained Prompt Screening: Defending Against Backdoor Attack on Text-to-Image Diffusion Models." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/xu2025ijcai-fine/) doi:10.24963/IJCAI.2025/68
BibTeX
@inproceedings{xu2025ijcai-fine,
title = {{Fine-Grained Prompt Screening: Defending Against Backdoor Attack on Text-to-Image Diffusion Models}},
author = {Xu, Yiran and Zhong, Nan and Li, Guobiao and Cheng, Anda and Wang, Yinggui and Qian, Zhenxing and Zhang, Xinpeng},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {601-609},
doi = {10.24963/IJCAI.2025/68},
url = {https://mlanthology.org/ijcai/2025/xu2025ijcai-fine/}
}