SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition

Abstract

Simultaneous Localization and Mapping (SLAM) and autonomous driving have become increasingly important in recent years, and point cloud-based large-scale place recognition is the backbone of both. While many models have achieved acceptable performance by learning short-range local features, they often overlook long-range contextual properties. Moreover, model size has become a serious bottleneck for their wide deployment. To overcome these challenges, we propose a super-lightweight network model termed SVT-Net. On top of the highly efficient 3D Sparse Convolution (SP-Conv), an Atom-based Sparse Voxel Transformer (ASVT) and a Cluster-based Sparse Voxel Transformer (CSVT) are proposed to learn short-range local features and long-range contextual features, respectively. Combining ASVT and CSVT, SVT-Net achieves state-of-the-art performance in terms of both recognition accuracy and running speed with a super-light model size (0.9M parameters). To further boost efficiency, we also introduce two simplified versions, which likewise achieve state-of-the-art performance while reducing the model size to 0.8M and 0.4M parameters, respectively.

Cite

Text

Fan et al. "SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I1.19934

Markdown

[Fan et al. "SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/fan2022aaai-svt/) doi:10.1609/AAAI.V36I1.19934

BibTeX

@inproceedings{fan2022aaai-svt,
  title     = {{SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition}},
  author    = {Fan, Zhaoxin and Song, Zhenbo and Liu, Hongyan and Lu, Zhiwu and He, Jun and Du, Xiaoyong},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {551--560},
  doi       = {10.1609/AAAI.V36I1.19934},
  url       = {https://mlanthology.org/aaai/2022/fan2022aaai-svt/}
}