Slot-Guided Volumetric Object Radiance Fields

Abstract

We present a novel framework for 3D object-centric representation learning. Our approach decomposes complex scenes into individual objects from a single image in an unsupervised fashion. The method, called slot-guided Volumetric Object Radiance Fields (sVORF), composes volumetric object radiance fields with object slots as guidance to achieve unsupervised 3D scene decomposition. Specifically, sVORF obtains object slots from a single image via a transformer module, maps these slots to volumetric object radiance fields with a hypernetwork, and composes the object radiance fields at a 3D location with the guidance of the object slots. Moreover, sVORF significantly reduces memory requirements thanks to small-sized pixel rendering during training. We demonstrate the effectiveness of our approach by showing top results on scene decomposition and generation tasks on complex synthetic datasets (e.g., Room-Diverse). We also confirm the potential of sVORF to segment objects in real-world scenes (e.g., the LLFF dataset). We hope our approach can provide a preliminary understanding of the physical world and help ease future research in 3D object-centric representation learning.
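
The slot-guided composition step can be illustrated with a small sketch. The abstract does not state the exact composition rule, so everything here is an assumption made for illustration: the function name `compose_at_point`, the `point_feature` input, and the choice of a softmax over slot-to-point dot products as the guidance weights are all hypothetical and may differ from what sVORF actually does.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def compose_at_point(slots, point_feature, densities, colors):
    """Blend K per-object (density, color) predictions at one 3D location.

    slots:         K slot vectors (lists of floats), one per object
    point_feature: feature vector for the 3D location (hypothetical input)
    densities:     K non-negative densities, one per object radiance field
    colors:        K RGB triples, one per object radiance field
    """
    # Assumed guidance rule: dot-product similarity between each slot
    # and the point feature, normalized with a softmax across slots.
    scores = [sum(s * p for s, p in zip(slot, point_feature)) for slot in slots]
    weights = softmax(scores)

    # Weighted sum of the per-object outputs gives the composed field value.
    density = sum(w * d for w, d in zip(weights, densities))
    color = [sum(w * c[ch] for w, c in zip(weights, colors)) for ch in range(3)]
    return weights, density, color


if __name__ == "__main__":
    slots = [[1.0, 0.0], [0.0, 1.0]]          # two toy object slots
    densities = [2.0, 4.0]                     # per-object densities at the point
    colors = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # red object, blue object
    print(compose_at_point(slots, [10.0, 0.0], densities, colors))
```

In this toy setup, a point whose feature aligns with the first slot takes almost all of its density and color from the first object's field, and the per-point weights can also be read as a soft object segmentation.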

Cite

Text

Qi et al. "Slot-Guided Volumetric Object Radiance Fields." Neural Information Processing Systems, 2023.

Markdown

[Qi et al. "Slot-Guided Volumetric Object Radiance Fields." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/qi2023neurips-slotguided/)

BibTeX

@inproceedings{qi2023neurips-slotguided,
  title     = {{Slot-Guided Volumetric Object Radiance Fields}},
  author    = {Qi, Di and Yang, Tong and Zhang, Xiangyu},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/qi2023neurips-slotguided/}
}