3D Mesh Editing Using Masked LRMs

Abstract

We present a novel approach to mesh shape editing that builds on recent progress in 3D reconstruction from multi-view images. We formulate shape editing as a conditional reconstruction problem: the model must reconstruct the input shape everywhere except in a specified 3D region, where the geometry is instead generated from a conditioning signal. To this end, we train a conditional Large Reconstruction Model (LRM) for masked reconstruction, using multi-view consistent masks rendered from a randomly generated 3D occlusion, with one clean viewpoint serving as the conditioning signal. At inference time, we manually define a 3D region to edit and provide an edited image from a canonical viewpoint to fill that region. We demonstrate that, in a single forward pass, our method not only preserves the input geometry in the unmasked region with reconstruction quality on par with the state of the art, but is also expressive enough to perform a variety of mesh edits from single-image guidance that past works struggle with, while running 2-10 times faster than the top-performing prior work.
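The multi-view consistent masks described above can be pictured as follows: a random 3D occluder is sampled near the object and projected into each training camera, so every view masks the *same* 3D region. This is a minimal illustrative sketch, not the paper's implementation; the sphere occluder, pinhole camera setup, and all function names are assumptions.

```python
import numpy as np

def random_occluder(rng, center_range=0.3, radius_range=(0.1, 0.25)):
    """Sample a random 3D sphere occluder (center, radius) near the object."""
    center = rng.uniform(-center_range, center_range, size=3)
    radius = rng.uniform(*radius_range)
    return center, radius

def render_mask(center, radius, cam_pos, fov=60.0, res=64):
    """Project the sphere into a pinhole camera looking at the origin,
    returning a binary occlusion mask for that viewpoint."""
    # Look-at rotation: the camera's z-axis points from cam_pos to the origin.
    forward = -cam_pos / np.linalg.norm(cam_pos)
    up = np.array([0.0, 1.0, 0.0])
    right = np.cross(forward, up)
    right /= np.linalg.norm(right)
    true_up = np.cross(right, forward)
    R = np.stack([right, true_up, forward])      # world -> camera rotation
    c_cam = R @ (center - cam_pos)               # sphere center in camera frame
    f = 0.5 * res / np.tan(np.radians(fov) / 2)  # focal length in pixels
    # Cast a ray through each pixel and mark pixels whose ray passes within
    # `radius` of the sphere center (point-to-line distance test).
    xs = np.arange(res) - res / 2 + 0.5
    u, v = np.meshgrid(xs, xs)
    dirs = np.stack([u, v, np.full_like(u, f)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
    dist = np.linalg.norm(np.cross(dirs, c_cam), axis=-1)
    return (dist < radius) & (dirs @ c_cam > 0)  # in front of the camera

rng = np.random.default_rng(0)
center, radius = random_occluder(rng)
# Four cameras on a circle around the object; in the paper's setup one view
# would be left clean to serve as the conditioning image.
cams = [np.array([np.cos(t), 0.3, np.sin(t)]) * 2.0
        for t in np.linspace(0, 2 * np.pi, 4, endpoint=False)]
masks = [render_mask(center, radius, c) for c in cams]
```

Because all masks are renderings of one shared 3D occluder, they agree across viewpoints, which is what lets the model learn to inpaint a coherent 3D region rather than independent 2D holes.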

Cite

Text

Gao et al. "3D Mesh Editing Using Masked LRMs." International Conference on Computer Vision, 2025.

Markdown

[Gao et al. "3D Mesh Editing Using Masked LRMs." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/gao2025iccv-3d/)

BibTeX

@inproceedings{gao2025iccv-3d,
  title     = {{3D Mesh Editing Using Masked LRMs}},
  author    = {Gao, Will and Wang, Dilin and Fan, Yuchen and Bozic, Aljaz and Stuyck, Tuur and Li, Zhengqin and Dong, Zhao and Ranjan, Rakesh and Sarafianos, Nikolaos},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {7154--7165},
  url       = {https://mlanthology.org/iccv/2025/gao2025iccv-3d/}
}