A Unified Approach to Interpreting Self-Supervised Pre-Training Methods for 3D Point Clouds via Interactions
Abstract
Recently, many self-supervised pre-training methods have been proposed to improve the performance of deep neural networks (DNNs) for 3D point cloud processing. However, the common mechanism underlying the effectiveness of different pre-training methods remains unclear. In this paper, we use game-theoretic interactions as a unified approach to explore the common mechanism of pre-training methods. Specifically, we decompose the output score of a DNN into the sum of numerous interaction effects, with each interaction representing a distinct 3D substructure of the input point cloud. Based on the decomposed interactions, we draw the following conclusions. (1) The common mechanism across different pre-training methods is that they enhance the strength of high-order interactions encoded by DNNs, which represent complex and global 3D structures, while reducing the strength of low-order interactions, which represent simple and local 3D structures. (2) Sufficient pre-training and adequate fine-tuning data for downstream tasks further reinforce the mechanism described above. (3) Pre-training methods carry a potential risk of reducing the transferability of features encoded by DNNs. Inspired by the observed common mechanism, we propose a new method to directly enhance the strength of high-order interactions and reduce the strength of low-order interactions encoded by DNNs, improving performance without the need for pre-training on large-scale datasets. Experiments show that our method achieves performance comparable to traditional pre-training methods.
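The decomposition described in the abstract can be illustrated with a small sketch. The abstract does not state which interaction definition is used, so this example assumes the Harsanyi dividend, I(S) = Σ_{T⊆S} (-1)^{|S|-|T|} v(T), one common game-theoretic choice for which the interaction effects sum exactly to the output score. The score function `toy_score`, the four "regions", and all helper names are hypothetical stand-ins for a DNN evaluated on masked point cloud regions, not the authors' implementation.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def harsanyi_interactions(v, n):
    """Return {S: I(S)} for every S within {0, ..., n-1}.

    Assumed definition: I(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T).
    """
    return {
        S: sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))
        for S in subsets(range(n))
    }


def strength_by_order(interactions):
    """Sum |I(S)| over all non-empty S of the same order (order = |S|)."""
    strength = {}
    for S, value in interactions.items():
        if S:
            strength[len(S)] = strength.get(len(S), 0.0) + abs(value)
    return strength


def toy_score(S):
    """Hypothetical output score when only the regions in S are kept."""
    score = 0.0
    if 0 in S:
        score += 1.0  # a local pattern inside region 0
    if {1, 2} <= S:
        score += 2.0  # a pattern that needs regions 1 and 2 together
    if {0, 1, 2, 3} <= S:
        score += 3.0  # a global pattern that needs all four regions
    return score


interactions = harsanyi_interactions(toy_score, 4)
strength = strength_by_order(interactions)
total = sum(interactions.values())  # equals toy_score on the full input

print(strength)
print(total, toy_score(frozenset(range(4))))
```

In this toy setting the only nonzero effects are the order-1, order-2, and order-4 interactions built into `toy_score`, and their sum recovers the score on the full input. Enumerating all subsets costs 2^n evaluations, so a real point cloud would first have to be grouped into a small number of regions; that grouping is also an assumption of this sketch.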
Cite
Text
Li et al. "A Unified Approach to Interpreting Self-Supervised Pre-Training Methods for 3D Point Clouds via Interactions." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02544
Markdown
[Li et al. "A Unified Approach to Interpreting Self-Supervised Pre-Training Methods for 3D Point Clouds via Interactions." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/li2025cvpr-unified/) doi:10.1109/CVPR52734.2025.02544
BibTeX
@inproceedings{li2025cvpr-unified,
title = {{A Unified Approach to Interpreting Self-Supervised Pre-Training Methods for 3D Point Clouds via Interactions}},
author = {Li, Qiang and Ruan, Jian and Wu, Fanghao and Chen, Yuchi and Wei, Zhihua and Shen, Wen},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2025},
pages = {27315-27324},
doi = {10.1109/CVPR52734.2025.02544},
url = {https://mlanthology.org/cvpr/2025/li2025cvpr-unified/}
}