Modeling Video Dynamics with Deep Dynencoder
Abstract
Videos often exhibit various motion patterns, which can be modeled via the dynamics between adjacent frames. Previous methods based on linear dynamic systems can model dynamic textures but have limited capacity for representing sophisticated nonlinear dynamics. Inspired by the nonlinear expressive power of deep autoencoders, we propose a novel model named the dynencoder, which has an autoencoder at the bottom and a variant of it at the top (named the dynpredictor). It generates hidden states from raw pixel inputs via the autoencoder and then encodes the dynamics of state transitions over time via the dynpredictor. A deep dynencoder can be constructed by a proper stacking strategy and trained by layer-wise pre-training followed by joint fine-tuning. Experiments verify that our model can describe sophisticated video dynamics and synthesize endless video texture sequences with high visual quality. We also design classification and clustering methods based on our model and demonstrate their efficacy on traffic scene classification and motion segmentation.
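The pipeline the abstract describes (encode frames into hidden states, predict the next state with the dynpredictor, decode back to pixels to synthesize new frames) can be sketched minimally as below. This is an illustrative NumPy sketch with randomly initialized weights and made-up dimensions, not the trained deep model from the paper; the actual model is stacked deeper and learned via layer-wise pre-training and joint fine-tuning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 64-pixel frames, 16-dim hidden state.
D, H = 64, 16

# Autoencoder weights (encoder/decoder); random here for illustration only.
W_enc = rng.normal(scale=0.1, size=(H, D)); b_enc = np.zeros(H)
W_dec = rng.normal(scale=0.1, size=(D, H)); b_dec = np.zeros(D)

# Dynpredictor: a one-layer mapping from state x_t to x_{t+1}.
W_dyn = rng.normal(scale=0.1, size=(H, H)); b_dyn = np.zeros(H)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def encode(frame):
    """Map a raw-pixel frame y_t to a hidden state x_t."""
    return sigmoid(W_enc @ frame + b_enc)

def predict_state(x):
    """Dynpredictor: estimate the next hidden state x_{t+1}."""
    return sigmoid(W_dyn @ x + b_dyn)

def decode(x):
    """Map a hidden state back to pixel space."""
    return sigmoid(W_dec @ x + b_dec)

def synthesize(first_frame, n_frames):
    """Roll the dynpredictor forward to generate a frame sequence,
    mirroring how the model synthesizes endless video textures."""
    x = encode(first_frame)
    frames = []
    for _ in range(n_frames):
        x = predict_state(x)
        frames.append(decode(x))
    return np.stack(frames)

video = synthesize(rng.uniform(size=D), n_frames=5)
```

Because synthesis only iterates the learned state transition in hidden space, sequences of arbitrary length can be generated from a single starting frame.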
Cite
Text
Yan et al. "Modeling Video Dynamics with Deep Dynencoder." European Conference on Computer Vision, 2014. doi:10.1007/978-3-319-10593-2_15

Markdown
[Yan et al. "Modeling Video Dynamics with Deep Dynencoder." European Conference on Computer Vision, 2014.](https://mlanthology.org/eccv/2014/yan2014eccv-modeling/) doi:10.1007/978-3-319-10593-2_15

BibTeX
@inproceedings{yan2014eccv-modeling,
title = {{Modeling Video Dynamics with Deep Dynencoder}},
author = {Yan, Xing and Chang, Hong and Shan, Shiguang and Chen, Xilin},
booktitle = {European Conference on Computer Vision},
year = {2014},
pages = {215-230},
doi = {10.1007/978-3-319-10593-2_15},
url = {https://mlanthology.org/eccv/2014/yan2014eccv-modeling/}
}