Neural Pixel Composition for 3D-4D View Synthesis from Multi-Views

Abstract

We present Neural Pixel Composition (NPC), a novel approach for continuous 3D-4D view synthesis given only a discrete set of multi-view observations as input. Existing state-of-the-art approaches require dense multi-view supervision and an extensive computational budget. The proposed formulation reliably operates on sparse and wide-baseline multi-view imagery and can be trained efficiently, within a few seconds to 10 minutes for high-resolution (12MP) content, i.e., 200-400x faster convergence than existing methods. Crucial to our approach are two core novelties: 1) a representation of a pixel that contains color and depth information accumulated from multiple views for a particular location and time along a line of sight, and 2) a multi-layer perceptron (MLP) that composes this rich per-pixel information to obtain the final color output. We experiment with a large variety of multi-view sequences, compare to existing approaches, and achieve better results in diverse and challenging settings.
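The abstract's two components suggest a simple per-pixel pipeline: gather (color, depth) samples for a pixel from the input views, then let an MLP map those samples to an output color. The sketch below is a minimal, hypothetical PyTorch illustration of that composition step; the layer widths, activations, and per-view feature layout are assumptions for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class PixelCompositionMLP(nn.Module):
    """Hypothetical sketch of the composition step described in the
    abstract: an MLP that maps per-pixel (color, depth) samples
    accumulated from multiple views to a single output color.
    Layer sizes and activations are illustrative assumptions."""

    def __init__(self, num_views: int, hidden: int = 128):
        super().__init__()
        # Each view contributes RGB (3) + depth (1) = 4 values per pixel.
        in_dim = num_views * 4
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 3),  # composed RGB output
            nn.Sigmoid(),          # map to [0, 1] color range
        )

    def forward(self, samples: torch.Tensor) -> torch.Tensor:
        # samples: (num_pixels, num_views, 4), i.e., per-view
        # (R, G, B, depth) along each pixel's line of sight.
        num_pixels = samples.shape[0]
        return self.net(samples.reshape(num_pixels, -1))

# Toy usage: compose 8-view samples for a batch of 1024 pixels.
model = PixelCompositionMLP(num_views=8)
per_pixel_samples = torch.rand(1024, 8, 4)
rgb = model(per_pixel_samples)  # (1024, 3)
```

Because the network operates on individual pixels rather than dense ray samples through a volume, a per-pixel formulation like this is consistent with the fast convergence the abstract reports, though the exact training setup is not specified here.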

Cite

Text

Bansal and Zollhöfer. "Neural Pixel Composition for 3D-4D View Synthesis from Multi-Views." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00036

Markdown

[Bansal and Zollhöfer. "Neural Pixel Composition for 3D-4D View Synthesis from Multi-Views." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/bansal2023cvpr-neural/) doi:10.1109/CVPR52729.2023.00036

BibTeX

@inproceedings{bansal2023cvpr-neural,
  title     = {{Neural Pixel Composition for 3D-4D View Synthesis from Multi-Views}},
  author    = {Bansal, Aayush and Zollhöfer, Michael},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {290--299},
  doi       = {10.1109/CVPR52729.2023.00036},
  url       = {https://mlanthology.org/cvpr/2023/bansal2023cvpr-neural/}
}