Language-Driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates

Ka Chun Shum, Jaeyeon Kim, Binh-Son Hua, Duc Thanh Nguyen, Sai-Kit Yeung

CVPR 2024 pp. 5176-5187

doi:10.1109/CVPR52733.2024.00495

Abstract

Neural radiance field (NeRF) is an emerging technique for 3D scene reconstruction and modeling. However, current NeRF-based methods are limited in their ability to add or remove objects. This paper addresses this gap by proposing a new language-driven method for object manipulation in NeRFs through dataset updates. Specifically, to insert an object represented by a set of multi-view images into a background NeRF, we use a text-to-image diffusion model to blend the object into the given background across views. The generated images are then used to update the NeRF so that we can render view-consistent images of the object within the background. To ensure view consistency, we propose a dataset update strategy that prioritizes the radiance field training based on camera poses in a pose-ordered manner. We validate our method in two case studies: object insertion and object removal. Experimental results show that our method generates photo-realistic results and achieves state-of-the-art performance in NeRF editing.

Cite

Text

Shum et al. "Language-Driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00495

Markdown

[Shum et al. "Language-Driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/shum2024cvpr-languagedriven/) doi:10.1109/CVPR52733.2024.00495

BibTeX

@inproceedings{shum2024cvpr-languagedriven,
  title     = {{Language-Driven Object Fusion into Neural Radiance Fields with Pose-Conditioned Dataset Updates}},
  author    = {Shum, Ka Chun and Kim, Jaeyeon and Hua, Binh-Son and Nguyen, Duc Thanh and Yeung, Sai-Kit},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {5176--5187},
  doi       = {10.1109/CVPR52733.2024.00495},
  url       = {https://mlanthology.org/cvpr/2024/shum2024cvpr-languagedriven/}
}