Learning to Cut by Watching Movies

Abstract

Video content creation keeps growing at an incredible pace; yet, creating engaging stories remains challenging and requires non-trivial video editing expertise. Many video editing components are astonishingly hard to automate primarily due to the lack of raw video materials. This paper focuses on a new task for computational video editing, namely the task of ranking cut plausibility. Our key idea is to leverage content that has already been edited to learn fine-grained audiovisual patterns that trigger cuts. To do this, we first collected a data source of more than 10K videos, from which we extract more than 260K cuts. We devise a model that learns to discriminate between real and artificial cuts via contrastive learning. We set up a new task and a set of baselines to benchmark video cut generation. We observe that our proposed model outperforms the baselines by large margins. To demonstrate our model in real-world applications, we conduct human studies on a collection of unedited videos. The results show that our model does a better job at cutting than random and alternative baselines.
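As a rough illustration of the idea of discriminating real from artificial cuts with a contrastive objective, the sketch below scores a candidate cut by the cosine similarity between two feature vectors and applies an InfoNCE-style loss. Everything here is an assumption made for illustration: the abstract does not specify the loss, the similarity function, or the feature format, and the names `cosine`, `contrastive_loss`, `rank_cuts`, and `temperature` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(left, real_right, fake_rights, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in, not the paper's exact loss).

    left        -- features of the clip just before the cut
    real_right  -- features of the clip that actually follows the cut
    fake_rights -- features of clips forming artificial cuts
    The loss is low when the real cut scores above every artificial cut.
    """
    logits = [cosine(left, real_right) / temperature]
    logits += [cosine(left, fake) / temperature for fake in fake_rights]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


def rank_cuts(left, candidates):
    """Return candidate indices ordered from most to least plausible."""
    scores = [cosine(left, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    left = [1.0, 0.0]
    candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
    print(rank_cuts(left, candidates))
    print(contrastive_loss(left, candidates[1], [candidates[0], candidates[2]]))
```

In a learned system the feature vectors would come from trained audiovisual encoders rather than hand-written lists; the toy vectors only show how a ranking over candidate cuts falls out of the same score that the loss is trained on.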

Cite

Text

Pardo et al. "Learning to Cut by Watching Movies." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00678

Markdown

[Pardo et al. "Learning to Cut by Watching Movies." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/pardo2021iccv-learning/) doi:10.1109/ICCV48922.2021.00678

BibTeX

@inproceedings{pardo2021iccv-learning,
  title     = {{Learning to Cut by Watching Movies}},
  author    = {Pardo, Alejandro and Caba, Fabian and Alcázar, Juan Léon and Thabet, Ali K. and Ghanem, Bernard},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {6858-6868},
  doi       = {10.1109/ICCV48922.2021.00678},
  url       = {https://mlanthology.org/iccv/2021/pardo2021iccv-learning/}
}