TimeLens-XL: Real-Time Event-Based Video Frame Interpolation with Large Motion

Abstract

Video Frame Interpolation (VFI) aims to predict intermediate frames between consecutive low-frame-rate inputs. To handle complex real-world motion between frames, event cameras, which capture high-frequency brightness changes at microsecond temporal resolution, are used to aid interpolation, a setting denoted Event-VFI. One critical step of Event-VFI is optical flow estimation. Prior methods that adopt either a two-segment formulation or a parametric trajectory model cannot correctly recover large and complex motions between frames and suffer from accumulated error in flow estimation. To solve this problem, we propose TimeLens-XL, a physically grounded lightweight network that decomposes large motion between two frames into a sequence of small motions for better accuracy. It estimates the entire motion trajectory recursively and samples the bi-directional flow for VFI. Benefiting from the accurate and robust flow prediction, intermediate frames can be efficiently synthesized with simple warping and blending. As a result, the network is extremely lightweight, with only 1/5 to 1/10 of the computational cost and model size of prior works, while also achieving state-of-the-art performance on several challenging benchmarks. To our knowledge, TimeLens-XL is the first real-time (27 FPS) Event-VFI algorithm at a resolution of 1280 × 720 using a single RTX 3090 GPU. Furthermore, we have collected a new RGB+Event dataset (HQ-EVFI) consisting of more than 100 challenging scenes with large complex motions and accurately synchronized high-quality RGB-EVS streams. HQ-EVFI addresses several limitations present in prior datasets and can serve as a new benchmark. Please visit our project website at https://openimaginglab.github.io/TimeLens-XL/ for the code and dataset.
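
The idea of splitting one large motion into a chain of small motions, then synthesizing a frame by warping and blending, can be illustrated with a minimal NumPy sketch. This is not the TimeLens-XL network: the per-step flows here are hand-made arrays rather than predictions from events, sampling is nearest-neighbour rather than bilinear, and the function names (`sample_nearest`, `compose_flows`, `backward_warp`, `blend`) are hypothetical names chosen for this illustration.

```python
import numpy as np


def sample_nearest(field, xs, ys):
    """Look up `field` at float pixel coordinates (nearest neighbour, border-clamped)."""
    h, w = field.shape[:2]
    xi = np.clip(np.rint(xs).astype(int), 0, w - 1)
    yi = np.clip(np.rint(ys).astype(int), 0, h - 1)
    return field[yi, xi]


def compose_flows(small_flows):
    """Chain per-step flows (each H x W x 2, as dx, dy) into cumulative flows.

    Assumed recursion: F_{0->k+1}(x) = F_{0->k}(x) + f_{k->k+1}(x + F_{0->k}(x)).
    """
    h, w = small_flows[0].shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    total = np.zeros((h, w, 2))
    cumulative = []
    for step in small_flows:
        # Read the next small motion at the position each pixel has reached so far.
        total = total + sample_nearest(step, xs + total[..., 0], ys + total[..., 1])
        cumulative.append(total.copy())
    return cumulative


def backward_warp(image, flow):
    """Backward warping: output(x) = image(x + flow(x))."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    return sample_nearest(image, xs + flow[..., 0], ys + flow[..., 1])


def blend(warped0, warped1, t):
    """Time-weighted blend of two warped frames for a target time t in [0, 1]."""
    return (1.0 - t) * warped0 + t * warped1


# Toy data: four small steps of one pixel to the right add up to a four-pixel motion.
steps = [np.tile(np.array([1.0, 0.0]), (4, 8, 1)) for _ in range(4)]
trajectory = compose_flows(steps)
frame = np.tile(np.arange(8, dtype=np.float64), (4, 1))
warped = backward_warp(frame, trajectory[1])  # cumulative flow after two steps
```

Each entry of `trajectory` is the accumulated displacement after one more small step, so an intermediate time can be served by picking the matching entry instead of estimating a single large flow directly.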

Cite

Text

Guo et al. "TimeLens-XL: Real-Time Event-Based Video Frame Interpolation with Large Motion." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72907-2_11

Markdown

[Guo et al. "TimeLens-XL: Real-Time Event-Based Video Frame Interpolation with Large Motion." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/guo2024eccv-timelensxl/) doi:10.1007/978-3-031-72907-2_11

BibTeX

@inproceedings{guo2024eccv-timelensxl,
  title     = {{TimeLens-XL: Real-Time Event-Based Video Frame Interpolation with Large Motion}},
  author    = {Guo, Shi and Chen, Yutian and Xue, Tianfan and Gu, Jinwei and Ma, Yongrui},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72907-2_11},
  url       = {https://mlanthology.org/eccv/2024/guo2024eccv-timelensxl/}
}