Feature Augmented Memory with Global Attention Network for VideoQA

Cai, Jiayin; Yuan, Chun; Shi, Cheng; Li, Lei; Cheng, Yangyang; Shan, Ying

doi:10.24963/IJCAI.2020/139

IJCAI 2020 pp. 998-1004

Abstract

Recently, Recurrent Neural Network (RNN) based methods and Self-Attention (SA) based methods have achieved promising performance in Video Question Answering (VideoQA). Despite this success, RNN-based methods tend to forget global semantic content because of the inherent drawbacks of recurrent units, while SA-based methods cannot precisely capture dependencies within a local neighborhood, leading to insufficient modeling of temporal order. To tackle these problems, we propose a novel VideoQA framework that progressively refines the representations of videos and questions from fine to coarse grain in a sequence-sensitive manner. Specifically, our model improves the feature representations in two steps: (1) introducing two fine-grained feature-augmented memories to strengthen the information augmentation of video and text, which improves memory capacity by memorizing more relevant and targeted information; and (2) appending a self-attention and co-attention module to the memory output, so that the module can capture global interaction between high-level semantic information. Experimental results show that our approach achieves state-of-the-art performance on VideoQA benchmark datasets.
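As a rough illustration of step (2), the sketch below applies self-attention to each modality and then co-attention between them, using NumPy. This is not the authors' implementation: the abstract gives no formulas, so the code assumes plain scaled dot-product attention with no learned projections, and every name and shape in it (`video_mem`, `question_mem`, 8 frames, 5 words, 16 dimensions) is hypothetical. The random arrays merely stand in for the outputs of the two feature-augmented memories, which are not modeled here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x):
    """Scaled dot-product self-attention without learned projections (assumption).

    x has shape (T, d); every position attends to all T positions, so each
    output row is a convex combination of the input rows.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)           # (T, T) pairwise similarities
    return softmax(scores, axis=-1) @ x     # (T, d)


def co_attention(video, question):
    """Symmetric co-attention between two sequences (assumed form).

    video: (T_v, d), question: (T_q, d).
    Returns question-aware video features (T_v, d) and
    video-aware question features (T_q, d).
    """
    d = video.shape[-1]
    affinity = video @ question.T / np.sqrt(d)           # (T_v, T_q)
    video_ctx = softmax(affinity, axis=1) @ question     # frames attend over words
    question_ctx = softmax(affinity.T, axis=1) @ video   # words attend over frames
    return video_ctx, question_ctx


# Hypothetical stand-ins for the two memory outputs.
rng = np.random.default_rng(0)
video_mem = rng.standard_normal((8, 16))      # 8 frames, 16-dim features
question_mem = rng.standard_normal((5, 16))   # 5 words, 16-dim features

video_sa = self_attention(video_mem)
question_sa = self_attention(question_mem)
video_ctx, question_ctx = co_attention(video_sa, question_sa)
```

Because every attention weight is computed over the full sequence, each output position can draw on any other position regardless of distance, which is the global interaction the abstract refers to. The local, order-sensitive modeling is left to the memory stage that precedes it.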

Cite

Text

Cai et al. "Feature Augmented Memory with Global Attention Network for VideoQA." International Joint Conference on Artificial Intelligence, 2020. doi:10.24963/IJCAI.2020/139

Markdown

[Cai et al. "Feature Augmented Memory with Global Attention Network for VideoQA." International Joint Conference on Artificial Intelligence, 2020.](https://mlanthology.org/ijcai/2020/cai2020ijcai-feature/) doi:10.24963/IJCAI.2020/139

BibTeX

@inproceedings{cai2020ijcai-feature,
  title     = {{Feature Augmented Memory with Global Attention Network for VideoQA}},
  author    = {Cai, Jiayin and Yuan, Chun and Shi, Cheng and Li, Lei and Cheng, Yangyang and Shan, Ying},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2020},
  pages     = {998-1004},
  doi       = {10.24963/IJCAI.2020/139},
  url       = {https://mlanthology.org/ijcai/2020/cai2020ijcai-feature/}
}