ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation

Abstract

Knowledge transfer from multiple modalities, i.e., LiDAR points and images, to a single LiDAR modality can take advantage of complementary information from modality fusion while keeping single-modality inference speed, showing a promising direction for point cloud semantic segmentation in autonomous driving. Recent advances in point cloud segmentation distill knowledge from strictly aligned point-pixel fusion features while leaving a large number of unmatched image pixels unexplored and unmatched LiDAR points under-benefited. In this paper, we propose a novel approach, named ProtoTransfer, which not only fully exploits image representations but also transfers the learned multi-modal knowledge to all point cloud features. Specifically, building on a basic multi-modal learning framework, we build a class-wise prototype bank from the strictly aligned fusion features and encourage all point cloud features to learn from the prototypes during model training. Moreover, to exploit the massive unmatched point and pixel features, we use a pseudo-labeling scheme and further accumulate these features into the class-wise prototype bank with a carefully designed fusion strategy. Without bells and whistles, our approach demonstrates superior performance over published state-of-the-art methods on two large-scale benchmarks, i.e., nuScenes and SemanticKITTI, and ranks 2nd on the competitive nuScenes Lidarseg challenge leaderboard.
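
The abstract does not say how the prototype bank is updated or how point features learn from it, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes one prototype per class, an exponential-moving-average update (the `momentum` value is an assumption), and a cross-entropy loss over temperature-scaled cosine similarities. The names `PrototypeBank` and `prototype_loss` are hypothetical. Under the same assumptions, pseudo-labeled unmatched features could be passed to `update` in the same way as aligned fusion features; the paper's fusion strategy is not reproduced here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


class PrototypeBank:
    """One running prototype per class (hypothetical helper, not the paper's code)."""

    def __init__(self, num_classes, dim, momentum=0.9):
        self.momentum = momentum  # assumed EMA coefficient
        self.protos = [[0.0] * dim for _ in range(num_classes)]
        self.seen = [False] * num_classes

    def update(self, features, labels):
        """Fold a batch of labeled features into the class prototypes."""
        sums, counts = {}, {}
        for feat, cls in zip(features, labels):
            acc = sums.setdefault(cls, [0.0] * len(feat))
            for i, x in enumerate(feat):
                acc[i] += x
            counts[cls] = counts.get(cls, 0) + 1
        for cls, acc in sums.items():
            mean = [x / counts[cls] for x in acc]
            if not self.seen[cls]:
                # First batch for this class: initialize with the batch mean.
                self.protos[cls] = mean
                self.seen[cls] = True
            else:
                m = self.momentum
                self.protos[cls] = [
                    m * p + (1 - m) * x for p, x in zip(self.protos[cls], mean)
                ]

    def logits(self, feature, temperature=0.1):
        """Temperature-scaled cosine similarity of a feature to every prototype."""
        f = l2_normalize(feature)
        return [
            sum(a * b for a, b in zip(f, l2_normalize(p))) / temperature
            for p in self.protos
        ]


def prototype_loss(bank, feature, label, temperature=0.1):
    """Cross-entropy that pulls a point feature toward its class prototype."""
    z = bank.logits(feature, temperature)
    z_max = max(z)
    log_sum_exp = z_max + math.log(sum(math.exp(x - z_max) for x in z))
    return log_sum_exp - z[label]


# Toy usage: build prototypes from "aligned" features, then score a point feature.
bank = PrototypeBank(num_classes=2, dim=2)
bank.update([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], [0, 0, 1])
loss_correct = prototype_loss(bank, [1.0, 0.1], 0)
loss_wrong = prototype_loss(bank, [1.0, 0.1], 1)
```

In this toy setup the loss is lower when the point feature is paired with the class whose prototype it resembles, which is the signal that would push all point cloud features, matched or not, toward the prototypes built from fused features.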

Cite

Text

Tang et al. "ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00309

Markdown

[Tang et al. "ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/tang2023iccv-prototransfer/) doi:10.1109/ICCV51070.2023.00309

BibTeX

@inproceedings{tang2023iccv-prototransfer,
  title     = {{ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation}},
  author    = {Tang, Pin and Xu, Hai-Ming and Ma, Chao},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {3337-3347},
  doi       = {10.1109/ICCV51070.2023.00309},
  url       = {https://mlanthology.org/iccv/2023/tang2023iccv-prototransfer/}
}