MetaPix: Few-Shot Video Retargeting
Abstract
We address the task of unsupervised retargeting of human actions from one video to another. We consider the challenging setting where only a few frames of the target are available. The core of our approach is a conditional generative model that can transcode input skeletal poses (automatically extracted with an off-the-shelf pose estimator) into output target frames. However, it is challenging to build a universal transcoder because humans can appear wildly different due to clothing and background scene geometry. Instead, we learn to adapt – or personalize – a universal generator to the particular human and background in the target. To do so, we use meta-learning to discover effective strategies for on-the-fly personalization. One significant benefit of meta-learning is that the personalized transcoder naturally enforces temporal coherence across its generated frames: all frames share the consistent clothing and background geometry of the target. We experiment on in-the-wild internet videos and images and show that our approach improves over widely-used baselines for the task.
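The abstract does not name the specific meta-learning algorithm, so the following is only a minimal, illustrative sketch of the general idea it describes: meta-training a pose-to-frame generator so that a few gradient steps on a handful of (pose, frame) pairs from a new target personalize it. The sketch uses a Reptile-style first-order update; `PoseToFrameGenerator`, `sample_task`, the stand-in convolution, and the L1 loss are all hypothetical placeholders, not the authors' architecture or losses.

```python
import copy
import torch

class PoseToFrameGenerator(torch.nn.Module):
    """Hypothetical generator mapping a pose map to an RGB frame.
    Any conditional image generator (e.g. a Pix2Pix-style network) fits here."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in

    def forward(self, pose_map):
        return self.net(pose_map)

def reptile_meta_train(generator, sample_task, meta_steps=1000,
                       inner_steps=5, inner_lr=1e-4, meta_lr=0.1):
    """Reptile-style meta-training: repeatedly personalize a copy of the
    generator on a few (pose, frame) pairs from one video, then move the
    meta-weights toward the personalized weights. The resulting
    initialization adapts quickly to a new target from only a few frames."""
    for _ in range(meta_steps):
        # A few-shot "task": k pose/frame pairs drawn from a single video,
        # so one task corresponds to one person and background.
        poses, frames = sample_task()

        # Inner loop: clone the meta-generator and fine-tune it on this task.
        learner = copy.deepcopy(generator)
        opt = torch.optim.Adam(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            loss = torch.nn.functional.l1_loss(learner(poses), frames)
            opt.zero_grad()
            loss.backward()
            opt.step()

        # Outer (Reptile) update: interpolate the meta-weights toward the
        # task-adapted weights, a first-order alternative to MAML.
        with torch.no_grad():
            for p_meta, p_task in zip(generator.parameters(),
                                      learner.parameters()):
                p_meta += meta_lr * (p_task - p_meta)
    return generator
```

At test time, the same inner loop run on the few available target frames yields the personalized transcoder; because one set of adapted weights renders every frame, the clothing and background stay consistent across the generated video.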
Cite
Text
Lee et al. "MetaPix: Few-Shot Video Retargeting." International Conference on Learning Representations, 2020.
Markdown
[Lee et al. "MetaPix: Few-Shot Video Retargeting." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/lee2020iclr-metapix/)
BibTeX
@inproceedings{lee2020iclr-metapix,
  title = {{MetaPix: Few-Shot Video Retargeting}},
  author = {Lee, Jessica and Ramanan, Deva and Girdhar, Rohit},
  booktitle = {International Conference on Learning Representations},
  year = {2020},
  url = {https://mlanthology.org/iclr/2020/lee2020iclr-metapix/}
}