ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation
Abstract
3D world models (i.e., learning-based 3D dynamics models) offer a promising approach to generalizable robotic manipulation by capturing the underlying physics of environment evolution conditioned on robot actions. However, existing 3D world models are primarily limited to single-material dynamics using a particle-based Graph Neural Network model, and often require time-consuming 3D scene reconstruction to obtain 3D particle tracks for training. In this work, we present ParticleFormer, a Transformer-based point cloud world model trained with a hybrid point cloud reconstruction loss that supervises both global and local dynamics features in multi-material, multi-object robot interactions. ParticleFormer captures fine-grained multi-object interactions between rigid, deformable, and flexible materials, and is trained directly from real-world robot perception data without elaborate scene reconstruction. We demonstrate the model’s effectiveness in both 3D scene forecasting tasks and downstream manipulation tasks using a Model Predictive Control (MPC) policy. In addition, we extend existing dynamics learning benchmarks to include diverse multi-material, multi-object interaction scenarios. We validate our method on six simulation and three real-world experiments, where it consistently outperforms leading baselines, achieving higher dynamics prediction accuracy and lower rollout error in downstream visuomotor tasks. Experimental videos are available at https://particleformer.github.io/.
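The abstract does not spell out the terms of the hybrid point cloud reconstruction loss. As a rough illustration only, the sketch below assumes one common way to combine two views of point cloud error: a Chamfer distance, which averages nearest-neighbor distances over all points, and a Hausdorff distance, which measures the single worst nearest-neighbor mismatch. The function names, the choice of these two terms, and the `weight` parameter are assumptions made for this example, not the authors' implementation.

```python
import math

def nearest_dists(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]

def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean nearest-neighbor distance in both directions."""
    d_ab = nearest_dists(a, b)
    d_ba = nearest_dists(b, a)
    return sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the worst nearest-neighbor distance in either direction."""
    return max(max(nearest_dists(a, b)), max(nearest_dists(b, a)))

def hybrid_loss(pred, target, weight=0.5):
    """Hypothetical hybrid loss: Chamfer term plus a weighted Hausdorff term.

    `weight` is an illustrative hyperparameter, not a value taken from the paper.
    """
    return chamfer_distance(pred, target) + weight * hausdorff_distance(pred, target)

# Two tiny 3D point clouds: a predicted next state and the observed one.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
loss = hybrid_loss(pred, target)
```

Because both terms compare unordered point sets, a loss of this kind needs no point-to-point correspondences, which is consistent with the abstract's claim of training without 3D particle tracks. A training implementation would need a batched, differentiable version of these distances; the brute-force loops here are only meant to make the definitions concrete.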
Cite
Text
Huang et al. "ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation." Proceedings of The 9th Conference on Robot Learning, 2025.
Markdown
[Huang et al. "ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation." Proceedings of The 9th Conference on Robot Learning, 2025.](https://mlanthology.org/corl/2025/huang2025corl-particleformer/)
BibTeX
@inproceedings{huang2025corl-particleformer,
title = {{ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation}},
author = {Huang, Suning and Chen, Qianzhong and Zhang, Xiaohan and Sun, Jiankai and Schwager, Mac},
booktitle = {Proceedings of The 9th Conference on Robot Learning},
year = {2025},
pages = {4941--4957},
volume = {305},
url = {https://mlanthology.org/corl/2025/huang2025corl-particleformer/}
}