Share with Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency

Abstract

Approaches for single-view reconstruction typically rely on viewpoint annotations, silhouettes, the absence of background, multiple views of the same instance, a template shape, or symmetry. We avoid all such supervision and assumptions by explicitly leveraging the consistency between images of different object instances. As a result, our method can learn from large collections of unlabelled images depicting the same object category. Our main contributions are two ways of leveraging cross-instance consistency: (i) progressive conditioning, a training strategy that gradually specializes the model from category to instances in a curriculum learning fashion; and (ii) neighbor reconstruction, a loss enforcing consistency between instances having similar shape or texture. Also critical to the success of our method are: our structured autoencoding architecture, which decomposes an image into explicit shape, texture, pose, and background; an adapted formulation of differentiable rendering; and a new optimization scheme alternating between 3D and pose learning. We compare our approach, UNICORN, both on the diverse synthetic ShapeNet dataset (the classical benchmark for methods requiring multiple views as supervision) and on standard real-image benchmarks (Pascal3D+ Car, CUB), for which most methods require known templates and silhouette annotations. We also showcase applicability to more challenging real-world collections (CompCars, LSUN), where silhouettes are not available and images are not cropped around the object.
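
The abstract does not say how progressive conditioning is implemented. As a rough illustration only, one plausible reading is a schedule that unlocks more dimensions of a per-instance latent code as training proceeds, so that early on every instance shares one category-level output. The function names `active_dims` and `condition`, the schedule parameters, and all numeric values below are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def active_dims(step: int, milestones: Sequence[int], sizes: Sequence[int]) -> int:
    """Return how many latent dimensions are unlocked at a training step.

    `milestones` and `sizes` are hypothetical schedule parameters: once
    `step` reaches milestones[i], sizes[i] dimensions are active.
    """
    n = 0
    for milestone, size in zip(milestones, sizes):
        if step >= milestone:
            n = size
    return n


def condition(code: Sequence[float], n_active: int) -> List[float]:
    """Zero out every latent dimension beyond the first `n_active`."""
    return [v if i < n_active else 0.0 for i, v in enumerate(code)]


# Illustrative schedule (assumed values, not from the paper).
milestones = [0, 1000, 5000]
sizes = [0, 2, 4]
code = [0.5, -1.0, 2.0, 0.25]

# With zero active dimensions all instances map to the same code,
# which corresponds to a single category-level prediction.
early = condition(code, active_dims(10, milestones, sizes))
mid = condition(code, active_dims(1200, milestones, sizes))
late = condition(code, active_dims(9000, milestones, sizes))
```

Under this sketch, `early` carries no instance-specific information, `mid` exposes only the first two dimensions, and `late` passes the full code through, which mirrors the category-to-instance curriculum the abstract describes.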

Cite

Text

Monnier et al. "Share with Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19769-7_17

Markdown

[Monnier et al. "Share with Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/monnier2022eccv-share/) doi:10.1007/978-3-031-19769-7_17

BibTeX

@inproceedings{monnier2022eccv-share,
  title     = {{Share with Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency}},
  author    = {Monnier, Tom and Fisher, Matthew and Efros, Alexei A. and Aubry, Mathieu},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19769-7_17},
  url       = {https://mlanthology.org/eccv/2022/monnier2022eccv-share/}
}