Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning

Lang, Jian; Cheng, Zhangtao; Zhong, Ting; Zhou, Fan

doi:10.1609/AAAI.V39I17.33984

AAAI 2025 pp. 18035-18043

Abstract

Multimodal learning with incomplete modalities is practical and challenging. Recently, researchers have focused on enhancing the robustness of pre-trained MultiModal Transformers (MMTs) under missing-modality conditions by applying learnable prompts. However, these prompt-based methods face several limitations: (1) incomplete modalities provide restricted modal cues for task-specific inference, (2) dummy imputation for missing content causes information loss and introduces noise, and (3) static prompts are instance-agnostic, offering limited knowledge for instances with various missing conditions. To address these issues, we propose RAGPT, a novel Retrieval-AuGmented dynamic Prompt Tuning framework. RAGPT comprises three modules: (I) the multi-channel retriever, which identifies similar instances through a within-modality retrieval strategy, (II) the missing modality generator, which recovers missing information using retrieved contexts, and (III) the context-aware prompter, which captures contextual knowledge from relevant instances and generates dynamic prompts to substantially enhance the MMT’s robustness. Extensive experiments conducted on three real-world datasets show that RAGPT consistently outperforms all competitive baselines in handling incomplete modality problems.
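To make the first two modules more concrete, the following is a minimal sketch of within-modality retrieval followed by a simple recovery step. It is not the authors' implementation: the abstract does not specify the similarity measure, the encoders, or the form of the generator, so cosine similarity, the toy two-dimensional embeddings, the neighbor-averaging rule, and the names `retrieve_within_modality` and `impute_missing` are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]
Instance = Dict[str, Optional[Vector]]  # e.g. {"text": [...], "image": None}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_within_modality(query: Instance, bank: List[Instance],
                             modality: str, k: int = 2) -> List[Instance]:
    """Rank bank entries by similarity to the query in one shared modality.

    Assumption: similarity is cosine over precomputed embeddings.
    """
    q = query[modality]
    if q is None:
        raise ValueError(f"query has no {modality!r} embedding")
    candidates = [inst for inst in bank if inst.get(modality) is not None]
    candidates.sort(key=lambda inst: cosine(q, inst[modality]), reverse=True)
    return candidates[:k]


def impute_missing(neighbors: List[Instance],
                   modality: str) -> Optional[List[float]]:
    """Average the neighbors' embeddings for the missing modality.

    Assumption: plain averaging stands in for the paper's generator.
    """
    vecs = [n[modality] for n in neighbors if n.get(modality) is not None]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


# Hypothetical memory bank of complete instances (toy embeddings).
bank = [
    {"text": [1.0, 0.0], "image": [0.0, 2.0]},
    {"text": [0.9, 0.1], "image": [0.0, 4.0]},
    {"text": [0.0, 1.0], "image": [8.0, 0.0]},
]
query = {"text": [1.0, 0.0], "image": None}  # image modality is missing

# Retrieve through the modality the query still has, then recover the other.
neighbors = retrieve_within_modality(query, bank, modality="text", k=2)
recovered_image = impute_missing(neighbors, modality="image")
```

In this toy setup the query is matched only through the modality it still has (text), and the missing image embedding is filled in from the retrieved neighbors rather than with a dummy placeholder. The context-aware prompter, which would turn the retrieved instances into dynamic prompts for the MMT, is left out because the abstract gives no detail on how those prompts are built.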

Cite

Text

Lang et al. "Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I17.33984

Markdown

[Lang et al. "Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/lang2025aaai-retrieval/) doi:10.1609/AAAI.V39I17.33984

BibTeX

@inproceedings{lang2025aaai-retrieval,
  title     = {{Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning}},
  author    = {Lang, Jian and Cheng, Zhangtao and Zhong, Ting and Zhou, Fan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {18035-18043},
  doi       = {10.1609/AAAI.V39I17.33984},
  url       = {https://mlanthology.org/aaai/2025/lang2025aaai-retrieval/}
}