Temporal Generative Adversarial Nets with Singular Value Clipping

Abstract

In this paper, we propose a generative model, Temporal Generative Adversarial Nets (TGAN), which can learn a semantic representation of unlabeled videos and is capable of generating videos. Unlike existing Generative Adversarial Nets (GAN)-based methods that generate videos with a single generator consisting of 3D deconvolutional layers, our model exploits two different types of generators: a temporal generator and an image generator. The temporal generator takes a single latent variable as input and outputs a set of latent variables, each of which corresponds to an image frame in a video. The image generator transforms such a set of latent variables into a video. To deal with the instability of GAN training with such advanced networks, we adopt the recently proposed Wasserstein GAN and propose a novel method to train it stably in an end-to-end manner. The experimental results demonstrate the effectiveness of our methods.
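The abstract describes a two-stage generator (a temporal generator producing one latent per frame, and an image generator producing each frame from those latents) together with a stabilization method named in the title, singular value clipping. The sketch below is only an illustration of that idea, not the authors' implementation: it is written in PyTorch, and the layer sizes, frame count, resolution, and the SVD-based clipping helper are assumptions chosen for readability rather than values taken from the paper.

import torch
import torch.nn as nn

T, Z0, Z1 = 16, 100, 100   # frame count and latent sizes (placeholders, not from the paper)

class TemporalGenerator(nn.Module):
    """Maps one latent z0 to T per-frame latents via 1D deconvolutions over time."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose1d(Z0, 512, kernel_size=1), nn.ReLU(),
            nn.ConvTranspose1d(512, 256, 4, 2, 1), nn.ReLU(),   # length 1 -> 2
            nn.ConvTranspose1d(256, 128, 4, 2, 1), nn.ReLU(),   # 2 -> 4
            nn.ConvTranspose1d(128, 128, 4, 2, 1), nn.ReLU(),   # 4 -> 8
            nn.ConvTranspose1d(128, Z1, 4, 2, 1), nn.Tanh(),    # 8 -> 16 frames
        )

    def forward(self, z0):                       # z0: (B, Z0)
        h = self.net(z0.unsqueeze(-1))           # (B, Z1, T)
        return h.transpose(1, 2)                 # (B, T, Z1)

class ImageGenerator(nn.Module):
    """Maps (z0, per-frame latent) to one 64x64 RGB frame via 2D deconvolutions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(Z0 + Z1, 512, 4), nn.ReLU(),     # 1x1 -> 4x4
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.ReLU(),   # 8x8
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),   # 16x16
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),    # 32x32
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),      # 64x64
        )

    def forward(self, z0, z1_t):                 # both (B, Z)
        z = torch.cat([z0, z1_t], dim=1)[..., None, None]
        return self.net(z)                       # (B, 3, 64, 64)

@torch.no_grad()
def singular_value_clip(module, bound=1.0):
    """Naive sketch of singular value clipping: clip each weight matrix's singular
    values to at most `bound`, so every linear map in the critic stays 1-Lipschitz
    (the constraint required by the Wasserstein GAN objective)."""
    for p in module.parameters():
        if p.dim() < 2:
            continue                             # skip biases
        w = p.flatten(1)                         # view conv kernels as (out, in*k*k) matrices
        u, s, vh = torch.linalg.svd(w, full_matrices=False)
        p.copy_(((u * s.clamp(max=bound)) @ vh).reshape(p.shape))

# Generating a batch of videos with this sketch:
temporal_g, image_g = TemporalGenerator(), ImageGenerator()
z0 = torch.randn(8, Z0)
z1 = temporal_g(z0)                              # (8, T, Z1)
video = torch.stack([image_g(z0, z1[:, t]) for t in range(z1.size(1))], dim=2)
# video: (8, 3, T, 64, 64); singular_value_clip(critic) would be applied to the
# discriminator's weights periodically during training in this sketch.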

Cite

Text

Saito et al. "Temporal Generative Adversarial Nets with Singular Value Clipping." International Conference on Computer Vision, 2017. doi:10.1109/ICCV.2017.308

Markdown

[Saito et al. "Temporal Generative Adversarial Nets with Singular Value Clipping." International Conference on Computer Vision, 2017.](https://mlanthology.org/iccv/2017/saito2017iccv-temporal/) doi:10.1109/ICCV.2017.308

BibTeX

@inproceedings{saito2017iccv-temporal,
  title     = {{Temporal Generative Adversarial Nets with Singular Value Clipping}},
  author    = {Saito, Masaki and Matsumoto, Eiichi and Saito, Shunta},
  booktitle = {International Conference on Computer Vision},
  year      = {2017},
  doi       = {10.1109/ICCV.2017.308},
  url       = {https://mlanthology.org/iccv/2017/saito2017iccv-temporal/}
}