Video Primal Sketch: A Generic Middle-Level Representation of Video
Abstract
This paper presents a middle-level video representation named Video Primal Sketch (VPS), which integrates two regimes of models: i) a sparse coding model using static or moving primitives to explicitly represent moving corners, lines, feature points, etc.; ii) a FRAME/MRF model with spatio-temporal filters to implicitly represent textured motion, such as water and fire, by matching feature statistics, i.e. histograms. This paper makes three contributions: i) learning a dictionary of video primitives as a parametric generative model; ii) studying the Spatio-Temporal FRAME (ST-FRAME) model for modeling and synthesizing textured motion; and iii) developing a parsimonious hybrid model for generic video representation. VPS selects the proper representation automatically and is compatible with high-level action representations. In the experiments, we synthesize a series of dynamic textures, reconstruct real videos, and show how the VPS varies with changes of density caused by scale transitions in videos.
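The ST-FRAME regime described above represents textured motion implicitly, by requiring a synthesized clip to reproduce the histograms of spatio-temporal filter responses measured on the observed clip. A minimal toy sketch of that histogram-matching criterion is below; the filter, data, and L1 comparison are illustrative stand-ins, not the paper's actual filter bank or learning algorithm.

```python
import numpy as np

# Toy illustration of the histogram-matching idea behind ST-FRAME.
# The filter and clips here are hypothetical stand-ins.

rng = np.random.default_rng(0)
video = rng.normal(size=(8, 16, 16))  # (time, height, width) toy clip

def st_filter_response(v):
    # A simple spatio-temporal filter: temporal difference of a
    # crude spatial blur (stand-in for a learned ST filter).
    blurred = (v + np.roll(v, 1, axis=1) + np.roll(v, 1, axis=2)) / 3.0
    return blurred[1:] - blurred[:-1]  # temporal derivative

def response_histogram(v, bins=15, lo=-3.0, hi=3.0):
    # Normalized histogram of filter responses: the feature
    # statistic that ST-FRAME constrains the synthesis to match.
    r = st_filter_response(v).ravel()
    h, _ = np.histogram(r, bins=bins, range=(lo, hi))
    return h / h.sum()

h_obs = response_histogram(video)
h_syn = response_histogram(rng.normal(size=(8, 16, 16)))

# FRAME-style criterion: the synthesized clip should reproduce the
# observed response statistics; here we just measure the L1 gap.
l1_gap = np.abs(h_obs - h_syn).sum()
print(l1_gap)
```

In the actual model, this matching is enforced by learning Lagrange multipliers on the filter histograms and sampling from the resulting Gibbs distribution, rather than by directly minimizing a histogram distance.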
Cite
Text
Han et al. "Video Primal Sketch: A Generic Middle-Level Representation of Video." IEEE/CVF International Conference on Computer Vision, 2011. doi:10.1109/ICCV.2011.6126380
Markdown
[Han et al. "Video Primal Sketch: A Generic Middle-Level Representation of Video." IEEE/CVF International Conference on Computer Vision, 2011.](https://mlanthology.org/iccv/2011/han2011iccv-video/) doi:10.1109/ICCV.2011.6126380
BibTeX
@inproceedings{han2011iccv-video,
title = {{Video Primal Sketch: A Generic Middle-Level Representation of Video}},
author = {Han, Zhi and Xu, Zongben and Zhu, Song-Chun},
booktitle = {IEEE/CVF International Conference on Computer Vision},
year = {2011},
pages = {1283--1290},
doi = {10.1109/ICCV.2011.6126380},
url = {https://mlanthology.org/iccv/2011/han2011iccv-video/}
}