Graph Prompts: Adapting Video Graph for Video Question Answering

Li, Yiming; Yang, Xiaoshan; Bao, Bing-Kun; Xu, Changsheng

doi:10.24963/IJCAI.2025/166

IJCAI 2025 pp. 1485-1493

Abstract

Because videos are dynamic, perceiving and reasoning about temporal information is a key focus of Video Question Answering (VideoQA). In recent years, several methods have explored relationship-level temporal modeling with graph-structured video representations. Unfortunately, these methods rely heavily on the question text, which makes it challenging to perceive and reason about video content that is not explicitly mentioned in the question. To address this challenge, we propose Graph Prompts-based VideoQA (GP-VQA), which adopts a video-based graph structure for enhanced video understanding. GP-VQA has two stages: pre-training and prompt tuning. In pre-training, we define a pretext task that requires GP-VQA to reason about randomly masked nodes or edges in the video graph, thus prompting GP-VQA to learn to reason from video-guided information. In prompt tuning, we organize the textual question into a question graph and implement message passing from the video graph to the question graph, thereby transferring the video-based reasoning ability learned in video graph completion to VideoQA. Extensive experiments on various datasets demonstrate the promising performance of GP-VQA.
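
As a rough illustration of the masking idea behind the pre-training pretext task, the sketch below hides random node and edge labels in a toy graph and records what a model would be asked to recover. It is not the authors' implementation: the dictionary-based graph layout, the `mask_graph` helper, the `[MASK]` placeholder, and the `mask_ratio` parameter are all assumptions made for this example, and the abstract does not say how GP-VQA represents or samples its masks.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def mask_graph(nodes, edges, mask_ratio=0.3, seed=0):
    """Randomly hide node and edge labels in a toy labeled graph.

    nodes: dict mapping node id -> label
    edges: dict mapping (src, dst) -> relation label
    Returns masked copies of both dicts plus the hidden labels,
    which serve as reconstruction targets in this sketch.
    """
    rng = random.Random(seed)
    masked_nodes, masked_edges = dict(nodes), dict(edges)
    targets = {"nodes": {}, "edges": {}}

    # Each node label is hidden independently with probability mask_ratio.
    for node_id, label in nodes.items():
        if rng.random() < mask_ratio:
            masked_nodes[node_id] = MASK
            targets["nodes"][node_id] = label

    # Edge (relation) labels are hidden the same way.
    for edge, relation in edges.items():
        if rng.random() < mask_ratio:
            masked_edges[edge] = MASK
            targets["edges"][edge] = relation

    return masked_nodes, masked_edges, targets


# Toy "video graph": objects as nodes, relations as edges (illustrative only).
nodes = {0: "person", 1: "cup", 2: "table"}
edges = {(0, 1): "holds", (1, 2): "on"}
masked_nodes, masked_edges, targets = mask_graph(nodes, edges, mask_ratio=0.5, seed=1)
```

In this sketch, writing the entries of `targets` back over the masked dictionaries restores the original graph, which is the property a graph-completion objective would train a model to approximate.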

Cite

Text

Li et al. "Graph Prompts: Adapting Video Graph for Video Question Answering." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/166

Markdown

[Li et al. "Graph Prompts: Adapting Video Graph for Video Question Answering." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/li2025ijcai-graph/) doi:10.24963/IJCAI.2025/166

BibTeX

@inproceedings{li2025ijcai-graph,
  title     = {{Graph Prompts: Adapting Video Graph for Video Question Answering}},
  author    = {Li, Yiming and Yang, Xiaoshan and Bao, Bing-Kun and Xu, Changsheng},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {1485-1493},
  doi       = {10.24963/IJCAI.2025/166},
  url       = {https://mlanthology.org/ijcai/2025/li2025ijcai-graph/}
}