A Unified Approach to Interpreting Self-Supervised Pre-Training Methods for 3D Point Clouds via Interactions

Abstract

Recently, many self-supervised pre-training methods have been proposed to improve the performance of deep neural networks (DNNs) for 3D point cloud processing. However, the common mechanism underlying the effectiveness of different pre-training methods remains unclear. In this paper, we use game-theoretic interactions as a unified approach to explore the common mechanism of pre-training methods. Specifically, we decompose the output score of a DNN into the sum of numerous effects of interactions, with each interaction representing a distinct 3D substructure of the input point cloud. Based on the decomposed interactions, we draw the following conclusions. (1) The common mechanism across different pre-training methods is that they enhance the strength of high-order interactions encoded by DNNs, which represent complex and global 3D structures, while reducing the strength of low-order interactions, which represent simple and local 3D structures. (2) Sufficient pre-training and adequate fine-tuning data for downstream tasks further reinforce the mechanism described above. (3) Pre-training methods carry a potential risk of reducing the transferability of features encoded by DNNs. Inspired by the observed common mechanism, we propose a new method to directly enhance the strength of high-order interactions and reduce the strength of low-order interactions encoded by DNNs, improving performance without the need for pre-training on large-scale datasets. Experiments show that our method achieves performance comparable to traditional pre-training methods.
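The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes the interactions are Harsanyi dividends, a common game-theoretic definition; the abstract itself does not name the exact formula. Here `toy_score` is a hypothetical stand-in for a DNN's output on a point cloud in which only the regions in `T` are left unmasked, and all function names are illustrative rather than taken from the paper. The order of an interaction is the number of regions it involves.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset, including the empty set."""
    items = tuple(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def harsanyi_interactions(v, n):
    """Assumed definition: I(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T).

    `v` maps a frozenset of region indices to a scalar score.
    Cost is exponential in n, so this only suits a handful of regions.
    """
    effects = {}
    for S in subsets(range(n)):
        effects[S] = sum(
            (-1) ** (len(S) - len(T)) * v(T) for T in subsets(S)
        )
    return effects


def strength_by_order(effects):
    """Sum the absolute interaction effects for each order |S|."""
    strength = {}
    for S, value in effects.items():
        strength[len(S)] = strength.get(len(S), 0.0) + abs(value)
    return strength


def toy_score(T):
    """Hypothetical score: one local cue, one pairwise cue, one global cue."""
    return 1.0 * (0 in T) + 0.5 * (1 in T and 2 in T) + 2.0 * (len(T) == 4)


effects = harsanyi_interactions(toy_score, 4)
strength = strength_by_order(effects)

# The effects of all interactions add up to the score on the full input.
total = sum(effects.values())
full_score = toy_score(frozenset(range(4)))
```

In this toy setting the order-4 interaction carries most of the strength, which is the kind of high-order, global effect the abstract says pre-training reinforces. The exhaustive enumeration is only practical for a few regions, so a real point cloud would need to be grouped into a small number of regions first.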

Cite

Text

Li et al. "A Unified Approach to Interpreting Self-Supervised Pre-Training Methods for 3D Point Clouds via Interactions." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02544

Markdown

[Li et al. "A Unified Approach to Interpreting Self-Supervised Pre-Training Methods for 3D Point Clouds via Interactions." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/li2025cvpr-unified/) doi:10.1109/CVPR52734.2025.02544

BibTeX

@inproceedings{li2025cvpr-unified,
  title     = {{A Unified Approach to Interpreting Self-Supervised Pre-Training Methods for 3D Point Clouds via Interactions}},
  author    = {Li, Qiang and Ruan, Jian and Wu, Fanghao and Chen, Yuchi and Wei, Zhihua and Shen, Wen},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {27315--27324},
  doi       = {10.1109/CVPR52734.2025.02544},
  url       = {https://mlanthology.org/cvpr/2025/li2025cvpr-unified/}
}