Deep Optics for Video Snapshot Compressive Imaging

Abstract

Video snapshot compressive imaging (SCI) aims to capture a sequence of video frames with only a single shot of a 2D detector. It relies on two components: optical modulation patterns (also known as masks) and a computational reconstruction algorithm. Advanced deep learning algorithms and mature hardware are putting video SCI into practical applications. Yet, there are two clouds in the sunshine of SCI: i) low dynamic range as a victim of high temporal multiplexing, and ii) the degradation of existing deep learning algorithms on real systems. To address these challenges, this paper presents a deep optics framework to jointly optimize masks and a reconstruction network. Specifically, we first propose a new type of structural mask to realize motion-aware and full-dynamic-range measurement. Considering the motion-awareness property in the measurement domain, we develop an efficient network for video SCI reconstruction, dubbed Res2former, which uses a Transformer to capture long-term temporal dependencies. Moreover, sensor response is introduced into the forward model of video SCI to keep end-to-end model training close to the real system. Finally, we implement the learned structural masks on a digital micro-mirror device. Experimental results on synthetic and real data validate the effectiveness of the proposed framework. We believe this is a milestone for real-world video SCI. The source code and data are available at https://github.com/pwangcs/DeepOpticsSCI.
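
The single-shot measurement described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the commonly used video SCI forward model, in which each frame is multiplied element-wise by its mask and the modulated frames are summed over time on the detector. The function name `sci_forward`, the `normalize` flag, the random binary masks in the usage note, and the sensor response (saturation followed by uniform quantization) are all illustrative assumptions. They are not the paper's learned structural masks, its sensor-response model, or code from the linked repository.

```python
import numpy as np

def sci_forward(frames, masks, bit_depth=8, normalize=True):
    """Simulate one snapshot measurement from T frames (hypothetical helper).

    frames, masks: arrays of shape (T, H, W); frame values are expected in [0, 1].
    """
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must share the shape (T, H, W)")

    # Temporal multiplexing: modulate every frame by its mask, then sum over time.
    measurement = (frames * masks).sum(axis=0)

    if normalize:
        # Assumed exposure scaling so that T fully open masks map back into [0, 1].
        measurement = measurement / frames.shape[0]

    # Assumed sensor response: saturation at full scale, then uniform quantization.
    levels = 2 ** bit_depth - 1
    measurement = np.clip(measurement, 0.0, 1.0)
    return np.round(measurement * levels) / levels
```

A possible usage, again with made-up data: `rng = np.random.default_rng(0)`, `frames = rng.random((8, 64, 64))`, `masks = rng.integers(0, 2, size=(8, 64, 64))`, then `y = sci_forward(frames, masks)` yields a single 64 x 64 measurement. Calling it with `normalize=False` shows how summing several frames can push pixels into saturation on a sensor with fixed full scale.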

Cite

Text

Wang et al. "Deep Optics for Video Snapshot Compressive Imaging." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00977

Markdown

[Wang et al. "Deep Optics for Video Snapshot Compressive Imaging." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/wang2023iccv-deep/) doi:10.1109/ICCV51070.2023.00977

BibTeX

@inproceedings{wang2023iccv-deep,
  title     = {{Deep Optics for Video Snapshot Compressive Imaging}},
  author    = {Wang, Ping and Wang, Lishun and Yuan, Xin},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {10646--10656},
  doi       = {10.1109/ICCV51070.2023.00977},
  url       = {https://mlanthology.org/iccv/2023/wang2023iccv-deep/}
}