NLOST: Non-Line-of-Sight Imaging with Transformer

Abstract

Time-resolved non-line-of-sight (NLOS) imaging recovers the 3D structure of hidden objects from multi-bounce indirect reflections. Reconstruction from NLOS measurements remains challenging, especially for complicated scenes. To boost performance, we present NLOST, the first transformer-based neural network for NLOS reconstruction. Specifically, after extracting shallow features with the assistance of physics-based priors, we design two spatial-temporal self-attention encoders that explore local and global correlations within the 3D NLOS data by splitting and downsampling the features into different scales, respectively. We then design a spatial-temporal cross-attention decoder that integrates the local and global features in the token space of the transformer, yielding deep features with high representation capability. Finally, the deep and shallow features are fused to reconstruct the 3D volume of the hidden scene. Extensive experimental results demonstrate the superior performance of the proposed method over existing solutions on both synthetic data and real-world data captured by different NLOS imaging systems.
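The pipeline described above (shallow feature extraction, local and global self-attention encoders, a cross-attention decoder, and a final fusion) can be illustrated with a minimal sketch, assuming PyTorch. All module names, the window count, and channel widths below are illustrative choices, not the authors' implementation, and the physics-based priors used for shallow feature extraction are omitted.

```python
import torch
import torch.nn as nn


class SelfAttentionBlock(nn.Module):
    # Pre-norm multi-head self-attention over a token sequence (B, N, C).
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, tokens):
        x = self.norm(tokens)
        out, _ = self.attn(x, x, x, need_weights=False)
        return tokens + out


class NLOSTSketch(nn.Module):
    # Hypothetical stand-in for the NLOST architecture; dimensions are toy values.
    def __init__(self, dim=32, heads=4, windows=2):
        super().__init__()
        self.windows = windows
        # Shallow feature extraction from the raw transient volume (B, 1, T, H, W).
        self.shallow = nn.Conv3d(1, dim, kernel_size=3, padding=1)
        # Local branch: self-attention inside non-overlapping spatial windows.
        self.local_enc = SelfAttentionBlock(dim, heads)
        # Global branch: downsample first so attention spans the whole volume.
        self.down = nn.Conv3d(dim, dim, kernel_size=2, stride=2)
        self.global_enc = SelfAttentionBlock(dim, heads)
        # Cross-attention decoder: local tokens query the global tokens.
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Fuse deep and shallow features into the reconstructed volume.
        self.fuse = nn.Conv3d(2 * dim, 1, kernel_size=3, padding=1)

    def forward(self, x):
        s = self.shallow(x)                      # (B, C, T, H, W)
        B, C, T, H, W = s.shape
        w = self.windows
        hw, ww = H // w, W // w

        # Local self-attention: one token sequence per spatial window.
        loc = s.view(B, C, T, w, hw, w, ww)
        loc = loc.permute(0, 3, 5, 2, 4, 6, 1).reshape(B * w * w, T * hw * ww, C)
        loc = self.local_enc(loc)
        loc = loc.view(B, w, w, T, hw, ww, C).permute(0, 6, 3, 1, 4, 2, 5)
        loc = loc.reshape(B, C, T, H, W).flatten(2).transpose(1, 2)  # (B, THW, C)

        # Global self-attention on the downsampled volume.
        g = self.down(s).flatten(2).transpose(1, 2)
        g = self.global_enc(g)

        # Cross-attention integrates global context into the local tokens.
        dec, _ = self.cross(loc, g, g, need_weights=False)
        deep = (loc + dec).transpose(1, 2).reshape(B, C, T, H, W)

        # Fuse deep and shallow features to predict the hidden 3D volume.
        return self.fuse(torch.cat([deep, s], dim=1))     # (B, 1, T, H, W)


# Usage on a tiny synthetic transient volume (sizes must be divisible by 2 * windows):
x = torch.randn(1, 1, 16, 16, 16)
print(NLOSTSketch()(x).shape)  # torch.Size([1, 1, 16, 16, 16])
```

The sketch mirrors the stated design choice: the local branch keeps full resolution but restricts attention to windows, the global branch trades resolution for full-volume receptive field, and cross-attention lets each local token borrow global context before fusion with the shallow features.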

Cite

Text

Li et al. "NLOST: Non-Line-of-Sight Imaging with Transformer." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01279

Markdown

[Li et al. "NLOST: Non-Line-of-Sight Imaging with Transformer." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/li2023cvpr-nlost/) doi:10.1109/CVPR52729.2023.01279

BibTeX

@inproceedings{li2023cvpr-nlost,
  title     = {{NLOST: Non-Line-of-Sight Imaging with Transformer}},
  author    = {Li, Yue and Peng, Jiayong and Ye, Juntian and Zhang, Yueyi and Xu, Feihu and Xiong, Zhiwei},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {13313--13322},
  doi       = {10.1109/CVPR52729.2023.01279},
  url       = {https://mlanthology.org/cvpr/2023/li2023cvpr-nlost/}
}