Attentive and Adversarial Learning for Video Summarization
Abstract
This paper addresses the video summarization problem via attention-aware and adversarial training. We formulate the problem as a sequence-to-sequence task, where the input sequence is an original video and the output sequence is its summary. We propose a GAN-based training framework that combines the merits of unsupervised and supervised video summarization approaches. The generator is an attention-aware Ptr-Net that produces the cutting points of summary fragments. The discriminator is a 3D CNN classifier that judges whether a fragment comes from a ground-truth summary or a generated one. The experiments show that our method achieves state-of-the-art results on the SumMe, TVSum, YouTube, and LoL datasets, with improvements of 1.5% to 5.6%. Our Ptr-Net generator can cope with the mismatch between training and test sequence lengths in the seq2seq problem, and our discriminator effectively leverages unpaired summaries to achieve better performance.
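As a rough illustration of the pointer mechanism the abstract describes, the sketch below scores each frame against a decoder query and "points" at one position in the input sequence, which is how a Ptr-Net can emit a cutting point. All names, shapes, and the choice of additive attention are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def pointer_attention(frame_feats, query, w_f, w_q, v):
    """Score every frame against the decoder query and return a
    probability distribution over input positions (the 'pointer').
    Additive (Bahdanau-style) attention: v^T tanh(W_f h_i + W_q q)."""
    scores = np.tanh(frame_feats @ w_f + query @ w_q) @ v
    exp = np.exp(scores - scores.max())  # stable softmax over positions
    return exp / exp.sum()

rng = np.random.default_rng(0)
T, d = 20, 8                          # 20 frames, 8-dim features (illustrative)
feats = rng.normal(size=(T, d))       # stand-in for per-frame CNN features
query = rng.normal(size=(d,))         # stand-in for the decoder state
w_f = rng.normal(size=(d, d))
w_q = rng.normal(size=(d, d))
v = rng.normal(size=(d,))

probs = pointer_attention(feats, query, w_f, w_q, v)
cut_point = int(probs.argmax())       # input index the pointer selects
```

Because the pointer distribution is defined over the input positions themselves, its length adapts to each video, which is why this formulation is not tied to a fixed training-sequence length.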
Cite
Text
Fu et al. "Attentive and Adversarial Learning for Video Summarization." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019. doi:10.1109/WACV.2019.00173

Markdown
[Fu et al. "Attentive and Adversarial Learning for Video Summarization." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019.](https://mlanthology.org/wacv/2019/fu2019wacv-attentive/) doi:10.1109/WACV.2019.00173

BibTeX
@inproceedings{fu2019wacv-attentive,
title = {{Attentive and Adversarial Learning for Video Summarization}},
author = {Fu, Tsu-Jui and Tai, Shao-Heng and Chen, Hwann-Tzong},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2019},
pages = {1579--1587},
doi = {10.1109/WACV.2019.00173},
url = {https://mlanthology.org/wacv/2019/fu2019wacv-attentive/}
}