Video Editing via Factorized Diffusion Distillation

Singer, Uriel; Zohar, Amit; Kirstain, Yuval; Sheynin, Shelly; Polyak, Adam; Parikh, Devi; Taigman, Yaniv

doi:10.1007/978-3-031-73116-7_26

ECCV 2024

Abstract

We introduce a model that establishes a new state of the art in video editing without relying on any supervised video editing data. To develop it, we separately train an image editing adapter and a video generation adapter, and attach both to the same text-to-image model. Then, to align the adapters towards video editing, we introduce a new unsupervised distillation procedure, Factorized Diffusion Distillation. This procedure distills knowledge from one or more teachers simultaneously, without any supervised data. We utilize this procedure to teach the model to edit videos by jointly distilling knowledge to (i) precisely edit each individual frame from the image editing adapter, and (ii) ensure temporal consistency among the edited frames using the video generation adapter. Finally, to demonstrate the potential of our approach in unlocking other capabilities, we align additional combinations of adapters.
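
The two-teacher idea in the abstract can be illustrated with a toy objective. The sketch below is not the paper's loss: it assumes simple mean-squared-error surrogates, represents each frame as a flat list of floats, and uses hypothetical names (`per_frame_loss`, `temporal_loss`, `joint_distillation_loss`, `w_edit`, `w_temporal`) that do not come from the source. It only shows how one term can pull each frame towards an image-editing teacher while a second term matches frame-to-frame changes to a video teacher.

```python
from typing import List

Frame = List[float]  # a flattened toy "frame" (assumption: real frames are tensors)
Video = List[Frame]  # an ordered sequence of frames


def mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two equally sized frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def per_frame_loss(student: Video, edit_teacher: Video) -> float:
    """Average per-frame error against the image-editing teacher's outputs."""
    return sum(mse(s, t) for s, t in zip(student, edit_teacher)) / len(student)


def frame_deltas(video: Video) -> Video:
    """Differences between consecutive frames; needs at least two frames."""
    return [[b - a for a, b in zip(f0, f1)] for f0, f1 in zip(video, video[1:])]


def temporal_loss(student: Video, video_teacher: Video) -> float:
    """Error between the student's and the video teacher's frame-to-frame changes."""
    s_deltas = frame_deltas(student)
    t_deltas = frame_deltas(video_teacher)
    return sum(mse(s, t) for s, t in zip(s_deltas, t_deltas)) / len(s_deltas)


def joint_distillation_loss(
    student: Video,
    edit_teacher: Video,
    video_teacher: Video,
    w_edit: float = 1.0,      # hypothetical weight for the per-frame term
    w_temporal: float = 1.0,  # hypothetical weight for the temporal term
) -> float:
    """Weighted sum of the two teacher terms (a toy surrogate, not the paper's loss)."""
    return (w_edit * per_frame_loss(student, edit_teacher)
            + w_temporal * temporal_loss(student, video_teacher))


if __name__ == "__main__":
    student = [[0.0, 0.0], [1.0, 1.0]]
    edit_teacher = [[0.0, 0.0], [1.0, 1.0]]   # student already matches per frame
    video_teacher = [[0.0, 0.0], [0.0, 0.0]]  # teacher shows no motion
    print(joint_distillation_loss(student, edit_teacher, video_teacher))
```

In this toy setup no ground-truth edited video appears anywhere: both targets come from teachers, which mirrors the unsupervised framing of the abstract, although the actual training signal used by the authors may differ substantially.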

Cite

Text

Singer et al. "Video Editing via Factorized Diffusion Distillation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73116-7_26

Markdown

[Singer et al. "Video Editing via Factorized Diffusion Distillation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/singer2024eccv-video/) doi:10.1007/978-3-031-73116-7_26

BibTeX

@inproceedings{singer2024eccv-video,
  title     = {{Video Editing via Factorized Diffusion Distillation}},
  author    = {Singer, Uriel and Zohar, Amit and Kirstain, Yuval and Sheynin, Shelly and Polyak, Adam and Parikh, Devi and Taigman, Yaniv},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73116-7_26},
  url       = {https://mlanthology.org/eccv/2024/singer2024eccv-video/}
}