SNI-SLAM: Semantic Neural Implicit SLAM

Abstract

We propose SNI-SLAM, a semantic SLAM system utilizing neural implicit representation that simultaneously performs accurate semantic mapping, high-quality surface reconstruction, and robust camera tracking. In this system, we introduce hierarchical semantic representation to allow multi-level semantic comprehension for top-down structured semantic mapping of the scene. In addition, to fully utilize the correlation between multiple attributes of the environment, we integrate appearance, geometry, and semantic features through cross-attention for feature collaboration. This strategy enables a more multifaceted understanding of the environment, thereby allowing SNI-SLAM to remain robust even when a single attribute is defective. Then, we design an internal fusion-based decoder to obtain semantic, RGB, and Truncated Signed Distance Field (TSDF) values from multi-level features for accurate decoding. Furthermore, we propose a feature loss to update the scene representation at the feature level. Compared with low-level losses such as RGB loss and depth loss, our feature loss is capable of guiding the network optimization at a higher level. Our SNI-SLAM method demonstrates superior performance over all recent NeRF-based SLAM methods in terms of mapping and tracking accuracy on the Replica and ScanNet datasets, while also showing excellent capabilities in accurate semantic segmentation and real-time semantic mapping. Code will be available at https://github.com/IRMVLab/SNI-SLAM.
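To make the cross-attention fusion and the feature-level loss more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not say which attribute acts as query, key, or value, nor which norm the feature loss uses. The choice here (geometry as query, appearance as key, semantic as value, and an L1 feature loss) is an assumption made only for illustration, as are the names `cross_attention` and `feature_loss` and the toy feature values.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention over lists of feature vectors.

    Each query attends over all keys; the resulting weights mix the values.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        weights = softmax(scores)
        fused.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return fused


def feature_loss(pred, target):
    """Mean absolute difference between two feature sets.

    The L1 norm is an assumption; the abstract only states that the loss
    operates at the feature level rather than on RGB or depth values.
    """
    count = sum(len(p) for p in pred)
    total = sum(abs(a - b) for p, t in zip(pred, target) for a, b in zip(p, t))
    return total / count


# Toy features (hypothetical values): two query points, three context points.
geometry = [[1.0, 0.0], [0.0, 1.0]]                 # queries (assumed role)
appearance = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # keys (assumed role)
semantic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # values

fused = cross_attention(geometry, appearance, semantic)
```

Because every fused vector is a weighted mix of all value vectors, a corrupted entry in one attribute is softened by the others, which is one way to read the robustness claim in the abstract. A feature loss such as `feature_loss(fused, reference_features)` would then compare fused features against reference features directly, rather than comparing rendered color or depth.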

Cite

Text

Zhu et al. "SNI-SLAM: Semantic Neural Implicit SLAM." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02000

Markdown

[Zhu et al. "SNI-SLAM: Semantic Neural Implicit SLAM." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/zhu2024cvpr-snislam/) doi:10.1109/CVPR52733.2024.02000

BibTeX

@inproceedings{zhu2024cvpr-snislam,
  title     = {{SNI-SLAM: Semantic Neural Implicit SLAM}},
  author    = {Zhu, Siting and Wang, Guangming and Blum, Hermann and Liu, Jiuming and Song, Liang and Pollefeys, Marc and Wang, Hesheng},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {21167-21177},
  doi       = {10.1109/CVPR52733.2024.02000},
  url       = {https://mlanthology.org/cvpr/2024/zhu2024cvpr-snislam/}
}