Synthesizing Dynamic Patterns by Spatial-Temporal Generative ConvNet
Abstract
Video sequences contain rich dynamic patterns, such as dynamic texture patterns that exhibit stationarity in the temporal domain, and action patterns that are non-stationary in either the spatial or the temporal domain. We show that a spatial-temporal generative ConvNet can be used to model and synthesize dynamic patterns. The model defines a probability distribution on the video sequence, and the log probability is defined by a spatial-temporal ConvNet that consists of multiple layers of spatial-temporal filters to capture spatial-temporal patterns of different scales. The model can be learned from the training video sequences by an "analysis by synthesis" learning algorithm that iterates the following two steps. Step 1 synthesizes video sequences from the currently learned model. Step 2 then updates the model parameters based on the difference between the synthesized video sequences and the observed training sequences. We show that the learning algorithm can synthesize realistic dynamic patterns.
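The two-step "analysis by synthesis" loop from the abstract can be sketched in miniature. The toy below is an assumption-laden illustration, not the paper's code: it replaces the multi-layer spatial-temporal ConvNet with a single linear scoring function on 1D toy signals, synthesizes via Langevin dynamics (Step 1), and updates parameters from the observed-vs-synthesized difference (Step 2). All names (`f`, `langevin_step`, the step sizes) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(Y, w):
    """Toy scoring function: a single linear filter response.
    In the paper, f is a multi-layer spatial-temporal ConvNet."""
    return np.sum(Y * w)

def grad_f_Y(Y, w):
    # Gradient of the linear score with respect to the signal.
    return np.broadcast_to(w, Y.shape)

def langevin_step(Y, w, step=0.3):
    """One Langevin update (Step 1: synthesis): drift along the score
    gradient minus the Gaussian reference term, plus injected noise."""
    noise = rng.normal(size=Y.shape)
    return Y + 0.5 * step**2 * (grad_f_Y(Y, w) - Y) + step * noise

# Toy "observed training sequences": 10 signals of length 8.
Y_obs = rng.normal(loc=1.0, size=(10, 8))

w = np.zeros(8)                      # model parameters
Y_syn = rng.normal(size=(10, 8))     # persistent synthesis chains
lr = 0.05
for _ in range(200):
    # Step 1: synthesize from the currently learned model.
    for _ in range(10):
        Y_syn = langevin_step(Y_syn, w)
    # Step 2: update parameters from the difference between
    # observed and synthesized sequences.
    w += lr * (Y_obs.mean(axis=0) - Y_syn.mean(axis=0))
```

At the fixed point the synthesized statistics match the observed ones, so in this linear toy the parameters drift toward the observed mean (here, near 1).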
Cite
Text
Xie et al. "Synthesizing Dynamic Patterns by Spatial-Temporal Generative ConvNet." Conference on Computer Vision and Pattern Recognition, 2017. doi:10.1109/CVPR.2017.119
Markdown
[Xie et al. "Synthesizing Dynamic Patterns by Spatial-Temporal Generative ConvNet." Conference on Computer Vision and Pattern Recognition, 2017.](https://mlanthology.org/cvpr/2017/xie2017cvpr-synthesizing/) doi:10.1109/CVPR.2017.119
BibTeX
@inproceedings{xie2017cvpr-synthesizing,
title = {{Synthesizing Dynamic Patterns by Spatial-Temporal Generative ConvNet}},
author = {Xie, Jianwen and Zhu, Song-Chun and Wu, Ying Nian},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2017},
doi = {10.1109/CVPR.2017.119},
url = {https://mlanthology.org/cvpr/2017/xie2017cvpr-synthesizing/}
}