Neural Rerendering in the Wild

Abstract

We explore total scene capture: recording, modeling, and rerendering a scene under varying appearance such as season and time of day. Starting from Internet photos of a tourist landmark, we apply traditional 3D reconstruction to register the photos and approximate the scene as a point cloud. For each photo, we render the scene points into a deep framebuffer, and train a deep neural network to learn the mapping of these initial renderings to the actual photos. This rerendering network also takes as input a latent appearance vector and a semantic mask indicating the location of transient objects like pedestrians. The model is evaluated on several datasets of publicly available images spanning a broad range of illumination conditions. We create short videos that demonstrate realistic manipulation of the image viewpoint, appearance, and semantic labels. We also compare results to prior work on scene reconstruction from Internet photos.
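
As a rough illustration of the step "render the scene points into a deep framebuffer", the sketch below splats colored 3D points into per-pixel color and depth buffers with a nearest-point depth test. It is not the authors' implementation. The pinhole camera model, the one-pixel splat size, and the names `render_deep_buffer`, `focal`, `width`, and `height` are assumptions made only for this example.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in camera coordinates (assumed)
Color = Tuple[int, int, int]        # RGB


def render_deep_buffer(
    points: List[Point],
    colors: List[Color],
    width: int,
    height: int,
    focal: float,
) -> Tuple[List[List[Optional[Color]]], List[List[float]]]:
    """Splat points into color and depth buffers, keeping the nearest point per pixel."""
    depth = [[math.inf] * width for _ in range(height)]
    color: List[List[Optional[Color]]] = [[None] * width for _ in range(height)]
    cx, cy = width / 2.0, height / 2.0  # principal point at the image center (assumed)

    for (x, y, z), rgb in zip(points, colors):
        if z <= 0:
            continue  # point is behind the camera
        # Pinhole projection to integer pixel coordinates.
        u = math.floor(focal * x / z + cx)
        v = math.floor(focal * y / z + cy)
        # Depth test: only overwrite a pixel if this point is closer.
        if 0 <= u < width and 0 <= v < height and z < depth[v][u]:
            depth[v][u] = z
            color[v][u] = rgb

    return color, depth


# Three points on the optical axis: the nearest visible one should win.
pts = [(0.0, 0.0, 2.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
cols = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
color_buf, depth_buf = render_deep_buffer(pts, cols, width=4, height=4, focal=1.0)
```

In a pipeline like the one the abstract describes, buffers of this kind would be the "initial renderings" passed to the rerendering network, together with the latent appearance vector and the semantic mask. How those inputs are actually encoded and combined is not stated in the abstract and is left out of this sketch.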

Cite

Text

Meshry et al. "Neural Rerendering in the Wild." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00704

Markdown

[Meshry et al. "Neural Rerendering in the Wild." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/meshry2019cvpr-neural/) doi:10.1109/CVPR.2019.00704

BibTeX

@inproceedings{meshry2019cvpr-neural,
  title     = {{Neural Rerendering in the Wild}},
  author    = {Meshry, Moustafa and Goldman, Dan B. and Khamis, Sameh and Hoppe, Hugues and Pandey, Rohit and Snavely, Noah and Martin-Brualla, Ricardo},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2019},
  doi       = {10.1109/CVPR.2019.00704},
  url       = {https://mlanthology.org/cvpr/2019/meshry2019cvpr-neural/}
}