Inferential Knowledge-Enhanced Integrated Reasoning for Video Question Answering

Abstract

Video question answering, which requires answering a question based on a fine-grained understanding of multi-modal video information, has recently attracted growing attention. Most existing methods have successfully explored deep understanding of the visual modality. We argue that a deep understanding of the linguistic modality is also essential for answer reasoning, especially for videos that contain character dialogues. To this end, we propose an Inferential Knowledge-Enhanced Integrated Reasoning method. Our method consists of two main components: 1) an Inferential Knowledge Reasoner that generates inferential knowledge for linguistic modality inputs, revealing deeper semantics such as implicit causes, effects, and mental states; and 2) an Integrated Reasoning Mechanism that enhances video content understanding and answer reasoning by leveraging the generated inferential knowledge. Experimental results show that our method achieves significant improvements on two mainstream datasets. The ablation study further demonstrates the effectiveness of each component of our approach.

Cite

Text

Mao et al. "Inferential Knowledge-Enhanced Integrated Reasoning for Video Question Answering." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I11.26570

Markdown

[Mao et al. "Inferential Knowledge-Enhanced Integrated Reasoning for Video Question Answering." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/mao2023aaai-inferential/) doi:10.1609/AAAI.V37I11.26570

BibTeX

@inproceedings{mao2023aaai-inferential,
  title     = {{Inferential Knowledge-Enhanced Integrated Reasoning for Video Question Answering}},
  author    = {Mao, Jianguo and Jiang, Wenbin and Liu, Hong and Wang, Xiangdong and Lyu, Yajuan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {13380--13388},
  doi       = {10.1609/AAAI.V37I11.26570},
  url       = {https://mlanthology.org/aaai/2023/mao2023aaai-inferential/}
}