Sparse Cross-Scale Attention Network for Efficient LiDAR Panoptic Segmentation

Abstract

Two major challenges of 3D LiDAR Panoptic Segmentation (PS) are that the points of an object are aggregated on its surface, which makes long-range dependencies hard to model, especially for large instances, and that objects are often too close together to separate from one another. Recent literature addresses these problems either with time-consuming grouping processes such as dual-clustering and mean-shift offsets, or with a bird's-eye-view (BEV) dense centroid representation that downplays geometry. However, the local feature learning in these methods does not sufficiently model long-range geometric relationships. To this end, we present SCAN, a novel sparse cross-scale attention network that first aligns multi-scale sparse features with global voxel-encoded attention to capture the long-range relationships of instance context, which boosts the regression accuracy for over-segmented large objects. For the surface-aggregated points, SCAN adopts a novel sparse class-agnostic representation of instance centroids, which not only maintains the sparsity of the aligned features to resolve under-segmentation of small objects, but also reduces the network's computation through sparse convolution. Our method outperforms previous methods by a large margin on the SemanticKITTI dataset for the challenging 3D PS task, achieving 1st place with real-time inference speed.

Cite

Text

Xu et al. "Sparse Cross-Scale Attention Network for Efficient LiDAR Panoptic Segmentation." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I3.20197

Markdown

[Xu et al. "Sparse Cross-Scale Attention Network for Efficient LiDAR Panoptic Segmentation." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/xu2022aaai-sparse/) doi:10.1609/AAAI.V36I3.20197

BibTeX

@inproceedings{xu2022aaai-sparse,
  title     = {{Sparse Cross-Scale Attention Network for Efficient LiDAR Panoptic Segmentation}},
  author    = {Xu, Shuangjie and Wan, Rui and Ye, Maosheng and Zou, Xiaoyi and Cao, Tongyi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {2920-2928},
  doi       = {10.1609/AAAI.V36I3.20197},
  url       = {https://mlanthology.org/aaai/2022/xu2022aaai-sparse/}
}