Dynamic Visual Sequence Prediction with Motion Flow Networks
Abstract
We target the problem of synthesizing future motion sequences from a temporally ordered set of input images. Previous methods have tackled this problem in two ways: predicting future image pixel values, and predicting the dense time-space trajectories of pixels. Generative encoder-decoder networks have been widely adopted in both kinds of methods. However, pixel prediction with these networks has been shown to suffer from blurry outputs, since images are generated from scratch and there is no explicit enforcement of visual coherency. Alternatively, crisp details can be achieved by transferring pixels from the input image through dense trajectory predictions, but this process requires pre-computed motion fields for training, which limits the learning ability of the neural networks. To synthesize realistic movement of objects under weak supervision (without pre-computed dense motion fields), we propose two novel network structures. Our first network encodes the input images as feature maps and uses a decoder network to predict the future pixel correspondences for a series of subsequent time steps. The resulting correspondence fields are then used to synthesize future views. Our second network focuses on human-centered capture, augmenting our framework with sparse pose estimates [30] that guide the dense correspondence prediction. Compared with state-of-the-art pixel-generating and dense-trajectory-predicting networks, our model performs better on synthetic as well as real-world human body movement sequences.
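The step of synthesizing a future view from a predicted correspondence field can be illustrated with a small NumPy sketch. It is not the authors' network or code: the function name `warp_with_flow`, the array layout, and the backward-sampling convention (each output pixel reads the input image at its own position plus an offset) are all assumptions made for illustration. It shows only why transferring pixels from the input frame keeps details crisp, since every output value is interpolated from existing input pixels rather than generated from scratch.

```python
import numpy as np


def warp_with_flow(image, flow):
    """Hypothetical helper: sample `image` at (x + dx, y + dy) for every output pixel.

    image: (H, W, C) float array, the input frame.
    flow:  (H, W, 2) float array; flow[..., 0] is dx, flow[..., 1] is dy
           (assumed backward-sampling convention, not taken from the paper).
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates, clamped to the image borders.
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each source coordinate.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Bilinear interpolation weights, broadcast over the channel axis.
    wx = (src_x - x0)[..., None]
    wy = (src_y - y0)[..., None]

    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom


# Toy usage: one bright pixel, and a field telling every output pixel
# to sample one pixel to its left, so the bright pixel moves right.
frame = np.zeros((4, 4, 1))
frame[1, 1, 0] = 1.0
flow = np.zeros((4, 4, 2))
flow[..., 0] = -1.0
future = warp_with_flow(frame, flow)
```

In a learned setting, a decoder would output one such field per future time step, and the warp would be implemented with a differentiable sampler so that the image reconstruction error can supervise the field without pre-computed motion.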
Cite
Text
Ji et al. "Dynamic Visual Sequence Prediction with Motion Flow Networks." IEEE/CVF Winter Conference on Applications of Computer Vision, 2018. doi:10.1109/WACV.2018.00119
Markdown
[Ji et al. "Dynamic Visual Sequence Prediction with Motion Flow Networks." IEEE/CVF Winter Conference on Applications of Computer Vision, 2018.](https://mlanthology.org/wacv/2018/ji2018wacv-dynamic/) doi:10.1109/WACV.2018.00119
BibTeX
@inproceedings{ji2018wacv-dynamic,
title = {{Dynamic Visual Sequence Prediction with Motion Flow Networks}},
author = {Ji, Dinghuang and Wei, Zheng and Dunn, Enrique and Frahm, Jan-Michael},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2018},
pages = {1038-1046},
doi = {10.1109/WACV.2018.00119},
url = {https://mlanthology.org/wacv/2018/ji2018wacv-dynamic/}
}