ScanEdit: Hierarchically-Guided Functional 3D Scan Editing
Abstract
With the growing ease of capture of real-world 3D scenes, effective editing becomes essential for the use of captured 3D scan data in various graphics applications. We present ScanEdit, which enables functional editing of complex, real-world 3D scans from natural language text prompts. By leveraging the high-level reasoning capabilities of large language models (LLMs), we construct a hierarchical scene graph representation for an input 3D scan given its instance decomposition. We develop a hierarchically-guided, multi-stage prompting approach using LLMs to decompose general language instructions (that can be vague, without referencing specific objects) into specific, actionable constraints that are applied to our scene graph. Our scene optimization integrates LLM-guided constraints along with 3D-based physical plausibility objectives, enabling the generation of edited scenes that align with a variety of input prompts, from abstract, functional-based goals to more detailed, specific instructions. This establishes a foundation for intuitive, text-driven 3D scene editing in real-world scenes.
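The abstract outlines a pipeline of three stages: a hierarchical scene graph built over the scan's object instances, LLM-derived actionable constraints, and an optimization that combines those constraints with physical plausibility terms. The sketch below is a minimal, hypothetical illustration of that structure (not the authors' code); all names (`SceneNode`, `Constraint`, `optimize_layout`) and the toy solver are assumptions made for illustration.

```python
# Hypothetical sketch of the pipeline described in the abstract:
# scene-graph nodes over scan instances, LLM-derived placement constraints,
# and a toy optimizer that also discourages physically implausible overlap.
from dataclasses import dataclass, field
import numpy as np


@dataclass
class SceneNode:
    """One object instance in the hierarchical scene graph."""
    name: str
    position: np.ndarray                       # 3D centroid of the instance
    size: np.ndarray                           # axis-aligned extent
    children: list["SceneNode"] = field(default_factory=list)


@dataclass
class Constraint:
    """An actionable placement constraint distilled from the text prompt."""
    node: SceneNode
    target: np.ndarray                         # desired position (from the LLM stage)
    weight: float = 1.0


def overlap_penalty(a: SceneNode, b: SceneNode) -> float:
    """Interpenetration of two axis-aligned boxes (a simple plausibility term)."""
    half = (a.size + b.size) / 2.0
    gap = np.abs(a.position - b.position) - half
    pen = np.clip(-gap, 0.0, None)             # positive only where the boxes overlap
    return float(np.prod(pen))


def optimize_layout(nodes, constraints, steps=200, lr=0.05):
    """Toy solver: pull nodes toward LLM targets while pushing apart overlaps.
    A real system would use a proper optimizer and richer physical objectives."""
    for _ in range(steps):
        for c in constraints:
            c.node.position += lr * c.weight * (c.target - c.node.position)
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                if overlap_penalty(a, b) > 0.0:
                    push = a.position - b.position
                    a.position += lr * push / (np.linalg.norm(push) + 1e-8)
                    b.position -= lr * push / (np.linalg.norm(push) + 1e-8)
    return nodes


if __name__ == "__main__":
    table = SceneNode("table", np.array([0.0, 0.0, 0.0]), np.array([1.2, 0.8, 0.7]))
    chair = SceneNode("chair", np.array([2.0, 0.0, 0.0]), np.array([0.5, 0.5, 0.9]))
    # e.g. an instruction such as "set up the table for dining" could be
    # decomposed by the LLM stage into a constraint like this one:
    constraints = [Constraint(chair, target=np.array([0.9, 0.0, 0.0]))]
    optimize_layout([table, chair], constraints)
    print(chair.name, np.round(chair.position, 2))
```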
Cite
Text
Boudjoghra et al. "ScanEdit: Hierarchically-Guided Functional 3D Scan Editing." International Conference on Computer Vision, 2025.

Markdown

[Boudjoghra et al. "ScanEdit: Hierarchically-Guided Functional 3D Scan Editing." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/boudjoghra2025iccv-scanedit/)

BibTeX
@inproceedings{boudjoghra2025iccv-scanedit,
title = {{ScanEdit: Hierarchically-Guided Functional 3D Scan Editing}},
author = {Boudjoghra, Mohamed El Amine and Laptev, Ivan and Dai, Angela},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {27105--27115},
url = {https://mlanthology.org/iccv/2025/boudjoghra2025iccv-scanedit/}
}