A Spatiotemporal Motion Model for Video Summarization
Abstract
The compact description of a video sequence through a single image map and a dominant motion has applications in several domains, including video browsing and retrieval, compression, mosaicing, and visual summarization. Building such a representation requires the capability to register all the frames with respect to the dominant object in the scene, a task which has been, in the past, addressed through temporally localized motion estimates. In this paper, we show how the lack of temporal consistency associated with such estimates can undermine the validity of the dominant motion assumption, leading to oscillation between different scene interpretations and poor registration. To avoid this oscillation, we augment the motion model with a generic temporal constraint which increases the robustness against competing interpretations, leading to more meaningful content summarization.
Cite
Text
Vasconcelos and Lippman. "A Spatiotemporal Motion Model for Video Summarization." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1998. doi:10.1109/CVPR.1998.698631
Markdown
[Vasconcelos and Lippman. "A Spatiotemporal Motion Model for Video Summarization." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1998.](https://mlanthology.org/cvpr/1998/vasconcelos1998cvpr-spatiotemporal/) doi:10.1109/CVPR.1998.698631
BibTeX
@inproceedings{vasconcelos1998cvpr-spatiotemporal,
title = {{A Spatiotemporal Motion Model for Video Summarization}},
author = {Vasconcelos, Nuno and Lippman, Andrew},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {1998},
pages = {361--366},
doi = {10.1109/CVPR.1998.698631},
url = {https://mlanthology.org/cvpr/1998/vasconcelos1998cvpr-spatiotemporal/}
}