VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-Trained Models

Abstract

Visual Question Answering (VQA) is a fundamental task at the intersection of computer vision and natural language processing. Although the “pre-training & fine-tuning” learning paradigm significantly improves VQA performance, the adversarial robustness of this paradigm has not been explored. In this paper, we delve into a new problem: using a pre-trained multimodal source model to create adversarial image-text pairs and then transferring them to attack target VQA models. Correspondingly, we propose a novel VQATTACK model, which iteratively generates both image and text perturbations with two designed modules: the large language model (LLM)-enhanced image attack module and the cross-modal joint attack module. At each iteration, the LLM-enhanced image attack module first optimizes a latent representation-based loss to generate feature-level image perturbations. It then incorporates an LLM to further enhance the image perturbations by optimizing the designed masked answer anti-recovery loss. The cross-modal joint attack module is triggered at a specific iteration and updates the image and text perturbations sequentially. Notably, the text perturbation updates are based on both the learned gradients in the word embedding space and word synonym-based substitution. Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQATTACK in the transferable attack setting, compared with state-of-the-art baselines. This work reveals a significant blind spot in the “pre-training & fine-tuning” paradigm on VQA tasks. The source code is available at https://github.com/ericyinyzy/VQAttack.
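The feature-level image-perturbation step described above follows the familiar iterative sign-gradient (PGD-style) recipe: push the adversarial image's latent features away from the clean image's features while staying inside a small L-infinity ball. The sketch below is not the paper's implementation; it is a minimal numpy toy in which a fixed linear map `W` stands in for the pre-trained image encoder, and gradients are taken numerically rather than by backpropagation.

```python
import numpy as np

def feature_loss(image, clean_feat, W):
    """Negative squared distance between the encoder features of `image`
    and the clean features; minimizing this pushes the features apart.
    `W` is a toy linear stand-in for a pre-trained image encoder."""
    feat = W @ image
    return -np.sum((feat - clean_feat) ** 2)

def pgd_image_attack(image, clean_feat, W, eps=0.03, alpha=0.01, steps=10):
    """PGD-style attack: take sign-gradient steps on the feature loss,
    then project back into the eps L-infinity ball around the clean image."""
    adv = image.copy()
    h = 1e-4  # finite-difference step
    for _ in range(steps):
        # Numerical gradient of the loss w.r.t. each pixel.
        grad = np.zeros_like(adv)
        for i in range(adv.size):
            e = np.zeros_like(adv)
            e[i] = h
            grad[i] = (feature_loss(adv + e, clean_feat, W)
                       - feature_loss(adv - e, clean_feat, W)) / (2 * h)
        adv = adv - alpha * np.sign(grad)             # descend the loss
        adv = np.clip(adv, image - eps, image + eps)  # eps-ball projection
        adv = np.clip(adv, 0.0, 1.0)                  # valid pixel range
    return adv
```

The actual VQATTACK pipeline differs in substance: it backpropagates through a pre-trained multimodal encoder, adds the LLM-based masked answer anti-recovery loss, and jointly perturbs the text, none of which this toy captures.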

Cite

Text

Yin et al. "VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-Trained Models." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I7.28499

Markdown

[Yin et al. "VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-Trained Models." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/yin2024aaai-vqattack/) doi:10.1609/AAAI.V38I7.28499

BibTeX

@inproceedings{yin2024aaai-vqattack,
  title     = {{VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-Trained Models}},
  author    = {Yin, Ziyi and Ye, Muchao and Zhang, Tianrong and Wang, Jiaqi and Liu, Han and Chen, Jinghui and Wang, Ting and Ma, Fenglong},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {6755--6763},
  doi       = {10.1609/AAAI.V38I7.28499},
  url       = {https://mlanthology.org/aaai/2024/yin2024aaai-vqattack/}
}