Character as Pixels: A Controllable Prompt Adversarial Attacking Framework for Black-Box Text Guided Image Generation Models

Abstract

In this paper, we study a controllable prompt adversarial attacking problem for text guided image generation (Text2Image) models in the black-box scenario, where the goal is to attack specific visual subjects (e.g., changing a brown dog to white) in a generated image by slightly, if not imperceptibly, perturbing the characters of the driving prompt (e.g., "brown" to "br0wn"). Our study is motivated by the limitations of current Text2Image attacking approaches that still rely on manual trials to create adversarial prompts. To address such limitations, we develop CharGrad, a character-level, gradient-based attacking framework that replaces specific characters of a prompt with pixel-level similar ones by interactively learning the perturbation direction for the prompt and updating the attacking examiner for the generated image based on a novel proxy perturbation representation for characters. We evaluate CharGrad using the texts from two public image captioning datasets. Results demonstrate that CharGrad outperforms existing text adversarial attacking approaches on attacking various subjects of generated images by black-box Text2Image models in a more effective and efficient way with less perturbation on the characters of the prompts.
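
To make the idea of "pixel-level similar" characters concrete, the following is a minimal sketch and not CharGrad itself: it omits the gradient-based learning of the perturbation direction and the attacking examiner entirely. It assumes a tiny hand-drawn 5x5 bitmap table (`GLYPHS`, hypothetical and not a real font) and measures similarity as the number of differing pixels. The helper names `pixel_distance`, `nearest_glyph`, and `perturb` are illustrative and do not come from the paper.

```python
# Toy 5x5 bitmaps, hand-drawn for illustration only (assumption: not a real font,
# and not the character representation used in the paper).
GLYPHS = {
    "o": [".###.",
          "#...#",
          "#...#",
          "#...#",
          ".###."],
    "0": [".###.",
          "#..##",
          "#.#.#",
          "##..#",
          ".###."],
    "l": ["..#..",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
    "1": [".##..",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
    "x": ["#...#",
          ".#.#.",
          "..#..",
          ".#.#.",
          "#...#"],
}


def pixel_distance(a, b):
    """Count the pixels that differ between two glyph bitmaps (Hamming distance)."""
    return sum(
        pa != pb
        for row_a, row_b in zip(GLYPHS[a], GLYPHS[b])
        for pa, pb in zip(row_a, row_b)
    )


def nearest_glyph(ch):
    """Return the visually closest other character, or None if ch has no bitmap."""
    if ch not in GLYPHS:
        return None
    return min((c for c in GLYPHS if c != ch), key=lambda c: pixel_distance(ch, c))


def perturb(prompt, index):
    """Replace the character at `index` with its nearest pixel-level neighbor."""
    sub = nearest_glyph(prompt[index])
    if sub is None:
        return prompt  # no bitmap available, leave the prompt unchanged
    return prompt[:index] + sub + prompt[index + 1:]


if __name__ == "__main__":
    print(perturb("brown", 2))
```

With this toy table, the `o` in "brown" is closest to `0`, so the sketch reproduces the "br0wn" example from the abstract. Which position to perturb is passed in by hand here; in the paper that choice is learned.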

Cite

Text

Kou et al. "Character as Pixels: A Controllable Prompt Adversarial Attacking Framework for Black-Box Text Guided Image Generation Models." International Joint Conference on Artificial Intelligence, 2023. doi:10.24963/IJCAI.2023/109

Markdown

[Kou et al. "Character as Pixels: A Controllable Prompt Adversarial Attacking Framework for Black-Box Text Guided Image Generation Models." International Joint Conference on Artificial Intelligence, 2023.](https://mlanthology.org/ijcai/2023/kou2023ijcai-character/) doi:10.24963/IJCAI.2023/109

BibTeX

@inproceedings{kou2023ijcai-character,
  title     = {{Character as Pixels: A Controllable Prompt Adversarial Attacking Framework for Black-Box Text Guided Image Generation Models}},
  author    = {Kou, Ziyi and Pei, Shichao and Tian, Yijun and Zhang, Xiangliang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {983--990},
  doi       = {10.24963/IJCAI.2023/109},
  url       = {https://mlanthology.org/ijcai/2023/kou2023ijcai-character/}
}