Video Pixel Networks

Abstract

We propose a probabilistic video model, the Video Pixel Network (VPN), that estimates the discrete joint distribution of the raw pixel values in a video. The model and the neural architecture reflect the time, space and color structure of video tensors and encode it as a four-dimensional dependency chain. The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth. The VPN also produces detailed samples on the action-conditional Robotic Pushing benchmark and generalizes to the motion of novel objects.
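The four-dimensional dependency chain described above corresponds to an autoregressive factorization of the joint distribution over pixels, ordered over time, then raster-scan space, then color channels, as in the PixelCNN family of models. A sketch of this chain rule (the notation here is illustrative, not the paper's exact symbols):

```latex
p(\mathbf{x}) \;=\; \prod_{t=1}^{T} \prod_{i=1}^{H} \prod_{j=1}^{W} \prod_{c \in \{R,G,B\}}
  p\!\left(x_{t,i,j,c} \,\middle|\, \mathbf{x}_{<t},\; \mathbf{x}_{t,<(i,j)},\; x_{t,i,j,<c}\right)
```

Here $\mathbf{x}_{<t}$ denotes all previous frames, $\mathbf{x}_{t,<(i,j)}$ the pixels above and to the left of $(i,j)$ in the current frame, and $x_{t,i,j,<c}$ the already-generated color channels of the current pixel; each conditional is a discrete distribution over raw pixel values.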

Cite

Text

Kalchbrenner et al. "Video Pixel Networks." International Conference on Machine Learning, 2017.

Markdown

[Kalchbrenner et al. "Video Pixel Networks." International Conference on Machine Learning, 2017.](https://mlanthology.org/icml/2017/kalchbrenner2017icml-video/)

BibTeX

@inproceedings{kalchbrenner2017icml-video,
  title     = {{Video Pixel Networks}},
  author    = {Kalchbrenner, Nal and van den Oord, Aäron and Simonyan, Karen and Danihelka, Ivo and Vinyals, Oriol and Graves, Alex and Kavukcuoglu, Koray},
  booktitle = {International Conference on Machine Learning},
  year      = {2017},
  pages     = {1771--1779},
  volume    = {70},
  url       = {https://mlanthology.org/icml/2017/kalchbrenner2017icml-video/}
}