Exploring Vision Semantic Prompt for Efficient Point Cloud Understanding
Abstract
A series of pre-trained models have demonstrated promising results in point cloud understanding and are widely applied to downstream tasks through fine-tuning. However, full fine-tuning leads to forgetting of pre-trained knowledge and incurs substantial storage costs on edge devices. Parameter-Efficient Transfer Learning (PETL) methods have been proposed to address these issues. Our analysis shows that existing 3D PETL methods cannot adequately align with the semantic relationships among features that downstream tasks require, resulting in suboptimal performance. To preserve parameter efficiency while introducing rich semantic cues, we propose a novel fine-tuning paradigm for 3D pre-trained models. We use frozen 2D pre-trained models to provide vision semantic prompts and design a new Hybrid Attention Adapter that efficiently fuses 2D semantic cues into 3D representations with minimal trainable parameters (1.8M). Extensive experiments on ScanObjectNN, ModelNet40, and ShapeNetPart demonstrate the effectiveness of the proposed paradigm. In particular, our method achieves 95.6% accuracy on ModelNet40 and 90.09% on the most challenging classification split, ScanObjectNN (PB-T50-RS).
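The general idea of fusing frozen 2D features into 3D tokens through a small trainable module can be illustrated with a minimal NumPy sketch. This is not the paper's Hybrid Attention Adapter, whose design the abstract does not specify; it is a generic low-rank cross-attention adapter, and the class name `CrossAttentionAdapter`, the token dimensions (384 and 768), and the bottleneck width (32) are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class CrossAttentionAdapter:
    """Hypothetical adapter: 3D tokens attend to frozen 2D tokens in a low-rank space."""

    def __init__(self, dim_3d, dim_2d, bottleneck, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.02
        self.w_q = rng.normal(0, scale, (dim_3d, bottleneck))  # queries from 3D tokens
        self.w_k = rng.normal(0, scale, (dim_2d, bottleneck))  # keys from 2D tokens
        self.w_v = rng.normal(0, scale, (dim_2d, bottleneck))  # values from 2D tokens
        # Zero-initialised up-projection: the adapter starts as an identity map,
        # so the pre-trained 3D features are untouched before training.
        self.w_up = np.zeros((bottleneck, dim_3d))

    def num_params(self):
        # Only these four matrices would be trained; both backbones stay frozen.
        return sum(w.size for w in (self.w_q, self.w_k, self.w_v, self.w_up))

    def __call__(self, tokens_3d, tokens_2d):
        q = tokens_3d @ self.w_q
        k = tokens_2d @ self.w_k
        v = tokens_2d @ self.w_v
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (num_3d, num_2d)
        return tokens_3d + (attn @ v) @ self.w_up        # residual fusion

# Assumed shapes: 128 point tokens of width 384, 196 image tokens of width 768.
adapter = CrossAttentionAdapter(dim_3d=384, dim_2d=768, bottleneck=32)
pts = np.random.default_rng(1).normal(size=(128, 384))
img = np.random.default_rng(2).normal(size=(196, 768))
out = adapter(pts, img)
print(out.shape, adapter.num_params())
```

Because every projection passes through the narrow bottleneck, the trainable parameter count grows linearly with the token widths rather than quadratically, which is the usual reason adapter-style modules stay small relative to the frozen backbone.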
Cite
Text
Zha et al. "Exploring Vision Semantic Prompt for Efficient Point Cloud Understanding." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Zha et al. "Exploring Vision Semantic Prompt for Efficient Point Cloud Understanding." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/zha2025icml-exploring/)
BibTeX
@inproceedings{zha2025icml-exploring,
title = {{Exploring Vision Semantic Prompt for Efficient Point Cloud Understanding}},
author = {Zha, Yixin and Wang, Chuxin and Yang, Wenfei and Zhang, Tianzhu and Wu, Feng},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {74271-74287},
volume = {267},
url = {https://mlanthology.org/icml/2025/zha2025icml-exploring/}
}