ARM: Appearance Reconstruction Model for Relightable 3D Generation

Abstract

Recent image-to-3D reconstruction models have greatly advanced geometry generation, but they still struggle to faithfully generate realistic appearance. To address this, we introduce ARM, a novel method that reconstructs high-quality 3D meshes and realistic appearance from sparse-view images. The core of ARM lies in decoupling geometry from appearance, processing appearance within the UV texture space. Unlike previous methods, ARM improves texture quality by explicitly back-projecting measurements onto the texture map and processing them in a UV space module with a global receptive field. To resolve ambiguities between material and illumination in input images, ARM introduces a material prior that encodes semantic appearance information, enhancing the robustness of appearance decomposition. Trained on just 8 H100 GPUs, ARM outperforms existing methods both quantitatively and qualitatively. Our project page is available at https://arm-aigc.github.io/.

Cite

Text

Feng et al. "ARM: Appearance Reconstruction Model for Relightable 3D Generation." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01996

Markdown

[Feng et al. "ARM: Appearance Reconstruction Model for Relightable 3D Generation." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/feng2025cvpr-arm/) doi:10.1109/CVPR52734.2025.01996

BibTeX

@inproceedings{feng2025cvpr-arm,
  title     = {{ARM: Appearance Reconstruction Model for Relightable 3D Generation}},
  author    = {Feng, Xiang and Yu, Chang and Bi, Zoubin and Shang, Yintong and Gao, Feng and Wu, Hongzhi and Zhou, Kun and Jiang, Chenfanfu and Yang, Yin},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {21425--21437},
  doi       = {10.1109/CVPR52734.2025.01996},
  url       = {https://mlanthology.org/cvpr/2025/feng2025cvpr-arm/}
}