PointInfinity: Resolution-Invariant Point Diffusion Models
Abstract
We present PointInfinity, an efficient family of point cloud diffusion models. Our core idea is to use a transformer-based architecture with a fixed-size, resolution-invariant latent representation. This enables efficient training with low-resolution point clouds while allowing high-resolution point clouds to be generated during inference. More importantly, we show that scaling the test-time resolution beyond the training resolution improves the fidelity of generated point clouds and surfaces. We analyze this phenomenon and draw a link to classifier-free guidance, commonly used in diffusion models, demonstrating that both allow trading off fidelity and variability during inference. Experiments on CO3D show that PointInfinity can efficiently generate high-resolution point clouds (up to 131k points, 31 times more than Point-E) with state-of-the-art quality.
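To make the idea of a fixed-size, resolution-invariant latent concrete, here is a minimal toy sketch. It assumes the latent exchanges information with the points through two cross-attention steps, here called "read" and "write"; the abstract does not spell out this mechanism, so the function names (`cross_attend`, `denoise_step`), the sizes, and the absence of learned weights are all illustrative assumptions, not the authors' implementation. The only point it illustrates is that the latent keeps the same number of tokens however many points are fed in.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys, values):
    """Single-head dot-product attention without learned projections.

    Returns one output row per query, so the output length follows the
    queries, not the keys/values.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim)])
    return out


def denoise_step(points, latent):
    """Hypothetical read/write round trip (toy, assumption only).

    read:  the fixed-size latent attends to all N points -> Z rows
    write: every point attends to the latent             -> N rows
    """
    latent = cross_attend(latent, points, points)
    update = cross_attend(points, latent, latent)
    return update, latent


if __name__ == "__main__":
    rng = random.Random(0)
    # Z = 8 latent tokens, each 3-dimensional to match xyz points (toy choice).
    latent = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(8)]
    for n in (64, 512):  # two different "resolutions"
        pts = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
        update, new_latent = denoise_step(pts, latent)
        # The per-point update scales with n; the latent stays at 8 tokens.
        print(n, len(update), len(new_latent))
```

In this sketch, the cost of the read and write steps grows linearly with the number of points while the latent stays the same size, which is one plausible reading of why a model of this kind could be trained at low resolution and sampled at a higher one.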
Cite
Text
Huang et al. "PointInfinity: Resolution-Invariant Point Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00958
Markdown
[Huang et al. "PointInfinity: Resolution-Invariant Point Diffusion Models." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/huang2024cvpr-pointinfinity/) doi:10.1109/CVPR52733.2024.00958
BibTeX
@inproceedings{huang2024cvpr-pointinfinity,
title = {{PointInfinity: Resolution-Invariant Point Diffusion Models}},
author = {Huang, Zixuan and Johnson, Justin and Debnath, Shoubhik and Rehg, James M. and Wu, Chao-Yuan},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2024},
pages = {10050-10060},
doi = {10.1109/CVPR52733.2024.00958},
url = {https://mlanthology.org/cvpr/2024/huang2024cvpr-pointinfinity/}
}