Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration for Video Captioning
Abstract
Fine-tuning large vision-language models is a challenging task. Prompt tuning approaches have been introduced to learn fixed textual or visual prompts while keeping the pre-trained model frozen in downstream tasks. Despite the effectiveness of prompt tuning, what those learnable prompts actually learn remains unexplained. In this work, we explore whether prompts in fine-tuning can learn from knowledge-aware prompts obtained in pre-training, by designing two different sets of prompts for the pre-training and fine-tuning phases, respectively. Specifically, we present a Video-Language Prompt tuning (VL-Prompt) approach for video captioning, which first efficiently pre-trains a video-language model to extract key information (e.g., actions and objects) with a flexibly generated Knowledge-Aware Prompt (KAP). Then, we design a Video-Language Prompt (VLP) to transfer the knowledge from the knowledge-aware prompts and fine-tune the model to generate full captions. Experimental results show the superior performance of our approach over several state-of-the-art baselines. We further demonstrate that the video-language prompts are well learned from the knowledge-aware prompts.
Cite
Text
Yan et al. "Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration for Video Captioning." International Joint Conference on Artificial Intelligence, 2023. doi:10.24963/IJCAI.2023/180
Markdown
[Yan et al. "Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration for Video Captioning." International Joint Conference on Artificial Intelligence, 2023.](https://mlanthology.org/ijcai/2023/yan2023ijcai-prompt/) doi:10.24963/IJCAI.2023/180
BibTeX
@inproceedings{yan2023ijcai-prompt,
title = {{Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration for Video Captioning}},
author = {Yan, Liqi and Han, Cheng and Xu, Zenglin and Liu, Dongfang and Wang, Qifan},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2023},
pages = {1622-1630},
doi = {10.24963/IJCAI.2023/180},
url = {https://mlanthology.org/ijcai/2023/yan2023ijcai-prompt/}
}