One Style Is All You Need to Generate a Video

Abstract

In this paper, we propose a style-based conditional video generative model. We introduce a novel temporal generator based on a set of learned sinusoidal bases. Our method learns dynamic representations of various actions that are independent of image content and can be transferred between different actors. Beyond significantly improving video quality compared to prevalent methods, we demonstrate that the disentangled dynamics and content permit independent manipulation of each, as well as temporal GAN-inversion to retrieve the motion of a video and transfer it from one content or identity to another without further preprocessing such as landmark points.
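To make the idea of a temporal generator built from learned sinusoidal bases concrete, here is a minimal, hypothetical PyTorch sketch. It is not the authors' implementation; the class name `SinusoidalTemporalGenerator`, the parameterization (learned frequencies, phases, and a linear mixing layer), and the additive fusion with a content style code are assumptions made purely for illustration.

```python
# Minimal sketch (not the authors' code): a temporal generator built from learned
# sinusoidal bases. Evaluating the bases at frame times t yields a per-frame motion
# ("dynamics") code that is separate from the content/identity style code.
import torch
import torch.nn as nn

class SinusoidalTemporalGenerator(nn.Module):
    def __init__(self, num_bases: int = 64, code_dim: int = 512):
        super().__init__()
        # Learned frequencies and phases of the sinusoidal bases (hypothetical parameterization).
        self.freq = nn.Parameter(torch.randn(num_bases))
        self.phase = nn.Parameter(torch.zeros(num_bases))
        # Learned linear mixing of the bases into a motion code per frame.
        self.mix = nn.Linear(num_bases, code_dim)

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (num_frames,) frame timestamps, e.g. in [0, 1].
        basis = torch.sin(t[:, None] * self.freq[None, :] + self.phase[None, :])
        return self.mix(basis)  # (num_frames, code_dim) motion codes

# Usage: combine a fixed content/identity style with per-frame motion codes.
temporal_gen = SinusoidalTemporalGenerator()
t = torch.linspace(0.0, 1.0, steps=16)
motion_codes = temporal_gen(t)                # one motion code per frame
content_style = torch.randn(1, 512)           # identity/content code shared across frames
frame_styles = content_style + motion_codes   # hypothetical fusion into per-frame styles
```

Because the motion codes depend only on time and the learned bases, swapping `content_style` while keeping `motion_codes` fixed would, under this sketch, transfer the same dynamics to a different actor.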

Cite

Text

Manandhar and Genovesio. "One Style Is All You Need to Generate a Video." Winter Conference on Applications of Computer Vision, 2024.

Markdown

[Manandhar and Genovesio. "One Style Is All You Need to Generate a Video." Winter Conference on Applications of Computer Vision, 2024.](https://mlanthology.org/wacv/2024/manandhar2024wacv-one/)

BibTeX

@inproceedings{manandhar2024wacv-one,
  title     = {{One Style Is All You Need to Generate a Video}},
  author    = {Manandhar, Sandeep and Genovesio, Auguste},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2024},
  pages     = {5038--5047},
  url       = {https://mlanthology.org/wacv/2024/manandhar2024wacv-one/}
}