Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning
Abstract
Learning semantically meaningful representations from unstructured 3D point clouds remains a central challenge in computer vision, especially in the absence of large-scale labeled datasets. While masked point modeling (MPM) is widely used in self-supervised 3D learning, its reconstruction-based objective can limit its ability to capture high-level semantics. We propose AsymDSD, an Asymmetric Dual Self-Distillation framework that unifies masked modeling and invariance learning through prediction in the latent space rather than the input space. AsymDSD builds on a joint embedding architecture and introduces several key design choices: an efficient asymmetric setup, disabling attention between masked queries to prevent shape leakage, multi-mask sampling, and a point cloud adaptation of multi-crop. AsymDSD achieves state-of-the-art results on ScanObjectNN (90.53%) and further improves to 93.72% when pretrained on 930k shapes, surpassing prior methods.
Cite
Text
Leijenaar and Kasaei. "Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning." Advances in Neural Information Processing Systems, 2025.
Markdown
[Leijenaar and Kasaei. "Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/leijenaar2025neurips-asymmetric/)
BibTeX
@inproceedings{leijenaar2025neurips-asymmetric,
title = {{Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning}},
author = {Leijenaar, Remco F. and Kasaei, Hamidreza},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/leijenaar2025neurips-asymmetric/}
}