TVSum: Summarizing Web Videos Using Titles
Abstract
Video summarization is a challenging problem in part because knowing which part of a video is important requires prior knowledge about its main topic. We present TVSum, an unsupervised video summarization framework that uses title-based image search results to find visually important shots. We observe that a video title is often carefully chosen to be maximally descriptive of its main topic, and hence images related to the title can serve as a proxy for important visual concepts of the main topic. However, because titles are free-formed, unconstrained, and often written ambiguously, images searched using the title can contain noise (images irrelevant to video content) and variance (images of different topics). To deal with this challenge, we developed a novel co-archetypal analysis technique that learns canonical visual concepts shared between video and images, but not in either alone, by finding a joint-factorial representation of two data sets. We introduce a new benchmark dataset, TVSum50, that contains 50 videos and their shot-level importance scores annotated via crowdsourcing. Experimental results on two datasets, SumMe and TVSum50, suggest our approach produces superior quality summaries compared to several recently proposed approaches.
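The core intuition in the abstract, that title-related images can stand in for a video's important visual concepts, can be illustrated with a much simpler stand-in than the paper's method. The sketch below is not co-archetypal analysis: it scores each shot by its mean cosine similarity to its most similar title images and keeps the top-scoring shots. The function names `shot_scores` and `select_shots`, the parameter `k`, and the assumption that shots and images are already encoded as fixed-length feature vectors are all hypothetical choices made for illustration.

```python
import numpy as np


def shot_scores(shot_feats, image_feats, k=3):
    """Score each shot by mean cosine similarity to its k most similar images.

    A nearest-neighbour stand-in for illustration only; it does not learn
    shared concepts the way the paper's joint factorization does.
    """
    shot_feats = np.asarray(shot_feats, dtype=float)
    image_feats = np.asarray(image_feats, dtype=float)

    # L2-normalise rows so that dot products are cosine similarities.
    x = shot_feats / np.maximum(
        np.linalg.norm(shot_feats, axis=1, keepdims=True), 1e-12)
    y = image_feats / np.maximum(
        np.linalg.norm(image_feats, axis=1, keepdims=True), 1e-12)

    sim = x @ y.T                        # shape: (n_shots, n_images)
    k = min(k, sim.shape[1])             # cannot use more images than exist
    top_k = np.sort(sim, axis=1)[:, -k:]  # k largest similarities per shot
    return top_k.mean(axis=1)


def select_shots(scores, budget):
    """Return indices of the `budget` highest-scoring shots, in temporal order."""
    top = np.argsort(scores)[::-1][:budget]
    return sorted(top.tolist())
```

Averaging over only the `k` nearest images, rather than all of them, is a crude way to limit the influence of the noisy or off-topic search results the abstract describes. It does not address them in the principled way a joint factorization would.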
Cite
Text
Song et al. "TVSum: Summarizing Web Videos Using Titles." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7299154
Markdown
[Song et al. "TVSum: Summarizing Web Videos Using Titles." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/song2015cvpr-tvsum/) doi:10.1109/CVPR.2015.7299154
BibTeX
@inproceedings{song2015cvpr-tvsum,
title = {{TVSum: Summarizing Web Videos Using Titles}},
author = {Song, Yale and Vallmitjana, Jordi and Stent, Amanda and Jaimes, Alejandro},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2015},
doi = {10.1109/CVPR.2015.7299154},
url = {https://mlanthology.org/cvpr/2015/song2015cvpr-tvsum/}
}