Context Graph Based Video Frame Prediction Using Locally Guided Objective

Abstract

This paper proposes a feature-reconstruction-based approach that uses a pixel-graph and Generative Adversarial Networks (GANs) to synthesize future frames from video scenes. Because recent frame-synthesis methods take a holistic approach, they often generate blurry outputs for long-range prediction and for scenes involving multiple objects moving at different velocities. Our proposed method introduces a novel pixel-graph based context aggregation layer (PixGraph) which efficiently captures long-range dependencies. PixGraph incorporates a weighting scheme through which the internal features of each pixel (or a group of neighboring pixels) can be modeled independently of the others, thus handling the issue of separate objects moving in different directions and at very different speeds. We also introduce a novel objective function, the Locally Guided Gram Loss (LGGL), which aids the GAN-based model in maximizing the similarity between the intermediate features of the ground truth and the network output by constructing Gram matrices from locally extracted patches over several levels of the generator. Our proposed model is end-to-end trainable and exhibits superior performance compared to the state of the art on four real-world benchmark video datasets.
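
The idea behind a patch-wise Gram objective can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how patches are extracted, how the Gram matrices are normalized, or how levels are weighted, so the non-overlapping patch grid, the normalization, the mean-squared difference, and every function name below (`gram_matrix`, `local_patches`, `locally_guided_gram_loss`) are assumptions made for illustration only.

```python
import numpy as np


def gram_matrix(patch):
    """Channel-by-channel Gram matrix of a (C, h, w) feature patch.

    Normalizing by C * h * w is an assumption borrowed from common
    style-loss practice, not something stated in the abstract.
    """
    c, h, w = patch.shape
    flat = patch.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)


def local_patches(features, patch_size):
    """Yield non-overlapping (C, patch_size, patch_size) patches.

    Rows and columns that do not fill a whole patch are dropped; the map
    must be at least patch_size in both spatial dimensions.
    """
    _, h, w = features.shape
    for top in range(0, h - patch_size + 1, patch_size):
        for left in range(0, w - patch_size + 1, patch_size):
            yield features[:, top:top + patch_size, left:left + patch_size]


def locally_guided_gram_loss(pred_levels, target_levels, patch_size=4):
    """Sum over feature levels of the mean squared Gram difference per patch.

    pred_levels and target_levels are lists of (C, H, W) arrays, one per
    feature level, with matching shapes at each level.
    """
    total = 0.0
    for pred, target in zip(pred_levels, target_levels):
        diffs = [
            np.mean((gram_matrix(p) - gram_matrix(t)) ** 2)
            for p, t in zip(local_patches(pred, patch_size),
                            local_patches(target, patch_size))
        ]
        total += float(np.mean(diffs))
    return total
```

Under these assumptions the loss is zero when the predicted and ground-truth feature maps agree at every level, and grows as their local channel correlations diverge. Computing the Gram matrix per patch rather than over the whole map keeps the statistics tied to a spatial neighborhood.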

Cite

Text

Bhattacharjee and Das. "Context Graph Based Video Frame Prediction Using Locally Guided Objective." European Conference on Computer Vision Workshops, 2018. doi:10.1007/978-3-030-11015-4_15

Markdown

[Bhattacharjee and Das. "Context Graph Based Video Frame Prediction Using Locally Guided Objective." European Conference on Computer Vision Workshops, 2018.](https://mlanthology.org/eccvw/2018/bhattacharjee2018eccvw-context/) doi:10.1007/978-3-030-11015-4_15

BibTeX

@inproceedings{bhattacharjee2018eccvw-context,
  title     = {{Context Graph Based Video Frame Prediction Using Locally Guided Objective}},
  author    = {Bhattacharjee, Prateep and Das, Sukhendu},
  booktitle = {European Conference on Computer Vision Workshops},
  year      = {2018},
  pages     = {169--185},
  doi       = {10.1007/978-3-030-11015-4_15},
  url       = {https://mlanthology.org/eccvw/2018/bhattacharjee2018eccvw-context/}
}