Multi-Plane Program Induction with 3D Box Priors
Abstract
We consider two important aspects in understanding and editing images: modeling regular, program-like texture or patterns in 2D planes, and 3D posing of these planes in the scene. Unlike prior work on image-based program synthesis, which assumes the image contains a single visible 2D plane, we present Box Program Induction (BPI), which infers a program-like scene representation that simultaneously models repeated structure on multiple 2D planes, the 3D position and orientation of the planes, and camera parameters, all from a single image. Our model assumes a box prior, i.e., that the image captures either an inner view or an outer view of a box in 3D. It uses neural networks to infer visual cues such as vanishing points and wireframe lines, which guide a search-based algorithm to find the program that best explains the image. Such a holistic, structured scene representation enables 3D-aware interactive image editing operations such as inpainting missing pixels, changing camera parameters, and extrapolating the image contents.
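To make the idea of a "program that best explains the image" concrete, here is a minimal toy sketch in Python. It is not the BPI algorithm: it assumes a single fronto-parallel grayscale plane (no box prior, no camera parameters, no neural visual cues), and the names `fit_tile`, `render`, and `induce_program` are hypothetical helpers introduced only for illustration. The "program" is a two-level loop that repeats one tile with periods `(py, px)`, and the search simply tries all small periods and keeps the one with the lowest reconstruction error.

```python
import numpy as np


def fit_tile(image, py, px):
    """Estimate the repeated tile by averaging all pixels that share a grid offset."""
    tile = np.zeros((py, px))
    for r in range(py):
        for c in range(px):
            tile[r, c] = image[r::py, c::px].mean()
    return tile


def render(tile, height, width):
    """Execute the 'program': repeat the tile over a height x width canvas."""
    py, px = tile.shape
    rows = np.arange(height) % py
    cols = np.arange(width) % px
    return tile[np.ix_(rows, cols)]


def induce_program(image, max_period=8, tol=1e-9):
    """Exhaustive search over periods (py, px), scored by mean squared error.

    Periods are visited in increasing order and replaced only on a strict
    improvement, so the smallest period among equally good ones is kept.
    Returns (error, py, px, tile).
    """
    height, width = image.shape
    best = None
    for py in range(1, min(max_period, height) + 1):
        for px in range(1, min(max_period, width) + 1):
            tile = fit_tile(image, py, px)
            err = float(np.mean((render(tile, height, width) - image) ** 2))
            if best is None or err < best[0] - tol:
                best = (err, py, px, tile)
    return best


# Toy input: a 2 x 3 motif repeated over an 8 x 15 "plane".
motif = np.array([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]])
image = np.tile(motif, (4, 5))

err, py, px, tile = induce_program(image, max_period=6)

# Extrapolation: run the same program on a larger canvas.
extended = render(tile, 10, 21)
```

Once the tile and its periods are recovered, calling `render` with a larger canvas extrapolates the pattern, and the same reconstruction could be used to fill in masked pixels. The method described in the abstract additionally has to recover plane pose and camera parameters, which this sketch leaves out entirely.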
Cite
Text
Li et al. "Multi-Plane Program Induction with 3D Box Priors." Neural Information Processing Systems, 2020.
Markdown
[Li et al. "Multi-Plane Program Induction with 3D Box Priors." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/li2020neurips-multiplane/)
BibTeX
@inproceedings{li2020neurips-multiplane,
title = {{Multi-Plane Program Induction with 3D Box Priors}},
author = {Li, Yikai and Mao, Jiayuan and Zhang, Xiuming and Freeman, Bill and Tenenbaum, Josh and Snavely, Noah and Wu, Jiajun},
booktitle = {Neural Information Processing Systems},
year = {2020},
url = {https://mlanthology.org/neurips/2020/li2020neurips-multiplane/}
}