E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning

Qu, Qiang; Shen, Yiran; Chen, Xiaoming; Chung, Yuk Ying; Liu, Tongliang

doi:10.1609/AAAI.V38I5.28263

AAAI 2024 pp. 4632-4640

Abstract

Bio-inspired event cameras, or dynamic vision sensors, asynchronously capture per-pixel brightness changes (called event-streams) with high temporal resolution and high dynamic range. However, the unstructured spatio-temporal event-streams make it challenging to provide intuitive visualization with rich semantic information for human vision. This calls for events-to-video (E2V) solutions, which take event-streams as input and generate high-quality video frames for intuitive visualization. Current solutions, however, are predominantly data-driven and do not consider prior knowledge of the underlying statistics relating event-streams and video frames. They rely heavily on the non-linearity and generalization capability of deep neural networks and thus struggle to reconstruct detailed textures when scenes are complex. In this work, we propose E2HQV, a novel E2V paradigm designed to produce high-quality video frames from events. This approach leverages a model-aided deep learning framework, underpinned by a theory-inspired E2V model derived from the fundamental imaging principles of event cameras. To deal with the issue of state-reset in the recurrent components of E2HQV, we also design a temporal shift embedding module to further improve the quality of the video frames. Comprehensive evaluations on real-world event camera datasets validate our approach, with E2HQV notably outperforming state-of-the-art approaches, e.g., surpassing the second best by over 40% for some evaluation metrics.

Cite

Text

Qu et al. "E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I5.28263

Markdown

[Qu et al. "E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/qu2024aaai-e/) doi:10.1609/AAAI.V38I5.28263

BibTeX

@inproceedings{qu2024aaai-e,
  title     = {{E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning}},
  author    = {Qu, Qiang and Shen, Yiran and Chen, Xiaoming and Chung, Yuk Ying and Liu, Tongliang},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {4632--4640},
  doi       = {10.1609/AAAI.V38I5.28263},
  url       = {https://mlanthology.org/aaai/2024/qu2024aaai-e/}
}