Point-MaDi: Masked Autoencoding with Diffusion for Point Cloud Pre-Training
Abstract
Self-supervised pre-training is essential for 3D point cloud representation learning, because annotating irregular, topology-free point clouds is costly and labor-intensive. Masked autoencoders (MAEs) offer a promising framework but rely on explicit positional embeddings, such as patch center coordinates, which leak geometric information and limit data-driven structural learning. In this work, we propose Point-MaDi, a novel Point cloud Masked autoencoding Diffusion framework for pre-training that integrates a dual-diffusion pretext task into an MAE architecture to address this issue. Specifically, we introduce a center diffusion mechanism in the encoder that noises and predicts the coordinates of both visible and masked patch centers without ground-truth positional embeddings. The predicted centers are processed by a transformer with self-attention and cross-attention to capture intra- and inter-patch relationships. In the decoder, we design a conditional patch diffusion process, guided by the encoder's latent features and the predicted centers, that reconstructs masked patches directly from noise. This dual-diffusion design drives the learning of comprehensive global semantic and local geometric representations during pre-training, eliminating the need for external geometric priors. Extensive experiments on ScanObjectNN, ModelNet40, ShapeNetPart, S3DIS, and ScanNet demonstrate that Point-MaDi achieves superior performance across downstream tasks, surpassing Point-MAE by 5.51% on OBJ-BG, 5.17% on OBJ-ONLY, and 4.34% on PB-T50-RS for 3D object classification on ScanObjectNN.
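To make the dual-diffusion pretext task concrete, the sketch below illustrates one possible pre-training step: patch centers are noised with a standard DDPM forward process and regressed back by a transformer, and masked patches are denoised conditioned on latent features and the predicted centers. This is a minimal, hypothetical sketch rather than the authors' implementation; all module names (`CenterDenoiser`, `PatchDenoiser`, `pretrain_step`), tensor shapes, the linear noise schedule, and the simple x0-prediction MSE losses are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch, not the authors' code: a minimal dual-diffusion
# pre-training step. Module names, shapes, and the x0-prediction losses
# are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000                                        # number of diffusion steps
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def q_sample(x0, t, noise):
    """DDPM forward process: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    a_bar = alphas_bar[t].view(-1, *([1] * (x0.dim() - 1)))
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

class CenterDenoiser(nn.Module):
    """Regresses clean patch centers from noised centers plus patch tokens."""
    def __init__(self, dim=256):
        super().__init__()
        self.inp = nn.Linear(3, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=4)
        self.out = nn.Linear(dim, 3)

    def forward(self, noised_centers, patch_tokens):    # (B, G, 3), (B, G, C)
        h = self.inp(noised_centers) + patch_tokens
        return self.out(self.blocks(h))                 # (B, G, 3)

class PatchDenoiser(nn.Module):
    """Denoises masked patches conditioned on latent features and centers."""
    def __init__(self, dim=256):
        super().__init__()
        self.cond = nn.Linear(dim + 3, dim)
        self.inp = nn.Linear(3, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=4)
        self.out = nn.Linear(dim, 3)

    def forward(self, noised_patches, latents, centers):
        # noised_patches: (B, M, K, 3); latents: (B, M, C); centers: (B, M, 3)
        B, M, K, _ = noised_patches.shape
        c = self.cond(torch.cat([latents, centers], dim=-1)).unsqueeze(2)
        h = (self.inp(noised_patches) + c).view(B * M, K, -1)
        return self.out(self.blocks(h)).view(B, M, K, 3)

def pretrain_step(center_net, patch_net, patch_tokens, centers,
                  masked_patches, latents):
    """One illustrative pre-training step; masked patches are assumed to be
    the first M of the G patches purely to keep the indexing simple."""
    B, M = masked_patches.size(0), masked_patches.size(1)
    t = torch.randint(0, T, (B,))

    # Center diffusion: noise all patch centers, regress them back.
    eps_c = torch.randn_like(centers)
    pred_centers = center_net(q_sample(centers, t, eps_c), patch_tokens)
    loss_center = F.mse_loss(pred_centers, centers)

    # Conditional patch diffusion: denoise masked patches given latents
    # and the predicted (not ground-truth) centers.
    eps_p = torch.randn_like(masked_patches)
    pred_patches = patch_net(q_sample(masked_patches, t, eps_p),
                             latents, pred_centers[:, :M])
    loss_patch = F.mse_loss(pred_patches, masked_patches)
    return loss_center + loss_patch

# Toy usage: B=2 clouds, G=64 patches, M=48 masked, K=32 points per patch.
center_net, patch_net = CenterDenoiser(), PatchDenoiser()
loss = pretrain_step(center_net, patch_net,
                     torch.randn(2, 64, 256), torch.randn(2, 64, 3),
                     torch.randn(2, 48, 32, 3), torch.randn(2, 48, 256))
```

Note that conditioning the patch denoiser on predicted rather than ground-truth centers is what removes the explicit positional-embedding shortcut; in this sketch the gradient from the patch loss also flows back into the center predictor.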
Cite
Text
Xiao et al. "Point-MaDi: Masked Autoencoding with Diffusion for Point Cloud Pre-Training." Advances in Neural Information Processing Systems, 2025.
Markdown
[Xiao et al. "Point-MaDi: Masked Autoencoding with Diffusion for Point Cloud Pre-Training." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/xiao2025neurips-pointmadi/)
BibTeX
@inproceedings{xiao2025neurips-pointmadi,
title = {{Point-MaDi: Masked Autoencoding with Diffusion for Point Cloud Pre-Training}},
author = {Xiao, Xiaoyang and Yao, Runzhao and Tian, Zhiqiang and Du, Shaoyi},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/xiao2025neurips-pointmadi/}
}