Feature Augmented Memory with Global Attention Network for VideoQA
Abstract
Recently, Recurrent Neural Network (RNN) based methods and Self-Attention (SA) based methods have achieved promising performance in Video Question Answering (VideoQA). Despite the success of these works, RNN-based methods tend to forget global semantic content due to the inherent drawbacks of recurrent units, while SA-based methods cannot precisely capture dependencies within a local neighborhood, leading to insufficient modeling of temporal order. To tackle these problems, we propose a novel VideoQA framework which progressively refines the representations of videos and questions from fine to coarse grain in a sequence-sensitive manner. Specifically, our model improves the feature representations via the following two steps: (1) introducing two fine-grained feature-augmented memories to strengthen the information augmentation of video and text, which can improve memory capacity by memorizing more relevant and targeted information; and (2) appending self-attention and co-attention modules to the memory output, so that the model is able to capture global interactions between high-level semantic information. Experimental results show that our approach achieves state-of-the-art performance on VideoQA benchmark datasets.
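The second step of the abstract, applying self-attention and co-attention to the memory output, can be illustrated with generic scaled dot-product attention. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function names `softmax` and `attend`, the feature dimension, the sequence lengths, and the random stand-in "memory outputs" are all hypothetical, and learned projection matrices are omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(queries, keys, values):
    # Generic scaled dot-product attention (no learned projections):
    # each query row is replaced by a weighted sum of the value rows.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ values, weights

# Hypothetical stand-ins for the memory outputs; shapes are assumptions.
rng = np.random.default_rng(0)
video = rng.standard_normal((8, 16))     # 8 video steps, feature dim 16
question = rng.standard_normal((5, 16))  # 5 question tokens, feature dim 16

# Self-attention: every video step attends to all video steps (global context).
video_sa, w_self = attend(video, video, video)

# Co-attention (one direction): every video step attends to the question tokens.
video_co, w_co = attend(video, question, question)
```

In this sketch, `w_self` has one row per video step spanning all video steps, which is what lets each position see the whole sequence; `w_co` has one row per video step spanning the question tokens.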
Cite
Text
Cai et al. "Feature Augmented Memory with Global Attention Network for VideoQA." International Joint Conference on Artificial Intelligence, 2020. doi:10.24963/IJCAI.2020/139
Markdown
[Cai et al. "Feature Augmented Memory with Global Attention Network for VideoQA." International Joint Conference on Artificial Intelligence, 2020.](https://mlanthology.org/ijcai/2020/cai2020ijcai-feature/) doi:10.24963/IJCAI.2020/139
BibTeX
@inproceedings{cai2020ijcai-feature,
title = {{Feature Augmented Memory with Global Attention Network for VideoQA}},
author = {Cai, Jiayin and Yuan, Chun and Shi, Cheng and Li, Lei and Cheng, Yangyang and Shan, Ying},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2020},
pages = {998-1004},
doi = {10.24963/IJCAI.2020/139},
url = {https://mlanthology.org/ijcai/2020/cai2020ijcai-feature/}
}