ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation
Abstract
Knowledge transfer from multiple modalities, i.e., LiDAR points and images, to a single LiDAR modality can take advantage of the complementary information provided by modality fusion while keeping single-modality inference speed, which makes it a promising direction for point cloud semantic segmentation in autonomous driving. Recent advances in point cloud segmentation distill knowledge from strictly aligned point-pixel fusion features, leaving a large number of unmatched image pixels unexplored and unmatched LiDAR points under-benefited. In this paper, we propose a novel approach, named ProtoTransfer, which not only fully exploits image representations but also transfers the learned multi-modal knowledge to all point cloud features. Specifically, based on the basic multi-modal learning framework, we build a class-wise prototype bank from the strictly aligned fusion features and encourage all point cloud features to learn from the prototypes during model training. Moreover, to exploit the massive number of unmatched point and pixel features, we use a pseudo-labeling scheme and further accumulate these features into the class-wise prototype bank with a carefully designed fusion strategy. Without bells and whistles, our approach demonstrates superior performance over published state-of-the-art methods on two large-scale benchmarks, i.e., nuScenes and SemanticKITTI, and ranks 2nd on the competitive nuScenes Lidarseg challenge leaderboard.
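The prototype-bank idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how prototypes are accumulated or which loss is used, so the exponential-moving-average update, the cosine-similarity cross-entropy loss, and the names `update_prototype_bank`, `prototype_transfer_loss`, `momentum`, and `temperature` are all assumptions chosen for illustration. The pseudo-labeling and fusion strategy for unmatched features is not covered.

```python
import numpy as np


def update_prototype_bank(bank, feats, labels, momentum=0.9):
    """Update class prototypes with the per-class mean of `feats`.

    Assumption: an exponential moving average is used; the abstract only
    says that features are accumulated into a class-wise bank.
    bank: (num_classes, dim), feats: (n, dim), labels: (n,) integer class ids.
    """
    new_bank = bank.copy()
    for c in np.unique(labels):
        class_mean = feats[labels == c].mean(axis=0)
        new_bank[c] = momentum * bank[c] + (1.0 - momentum) * class_mean
    return new_bank


def prototype_transfer_loss(point_feats, labels, bank, temperature=0.1):
    """Cross-entropy over cosine similarities to the class prototypes.

    Assumption: this loss form is a common choice for prototype learning,
    not one confirmed by the abstract.
    """
    eps = 1e-8
    f = point_feats / (np.linalg.norm(point_feats, axis=1, keepdims=True) + eps)
    p = bank / (np.linalg.norm(bank, axis=1, keepdims=True) + eps)
    logits = f @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(labels)), labels].mean())
```

In such a setup, the bank would be updated from the strictly aligned fusion features, while the loss would be applied to every LiDAR point feature that has a label, pulling each one toward the prototype of its class.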
Cite
Text
Tang et al. "ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00309
Markdown
[Tang et al. "ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/tang2023iccv-prototransfer/) doi:10.1109/ICCV51070.2023.00309
BibTeX
@inproceedings{tang2023iccv-prototransfer,
title = {{ProtoTransfer: Cross-Modal Prototype Transfer for Point Cloud Segmentation}},
author = {Tang, Pin and Xu, Hai-Ming and Ma, Chao},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {3337-3347},
doi = {10.1109/ICCV51070.2023.00309},
url = {https://mlanthology.org/iccv/2023/tang2023iccv-prototransfer/}
}