Learning a Predictable and Generative Vector Representation for Objects

Abstract

What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects; as well as be predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable. This enables tackling a number of tasks including voxel prediction from 2D images and 3D model retrieval. Extensive experimental analysis demonstrates the usefulness and versatility of this embedding.

Cite

Text

Girdhar et al. "Learning a Predictable and Generative Vector Representation for Objects." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46466-4_29

Markdown

[Girdhar et al. "Learning a Predictable and Generative Vector Representation for Objects." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/girdhar2016eccv-learning/) doi:10.1007/978-3-319-46466-4_29

BibTeX

@inproceedings{girdhar2016eccv-learning,
  title     = {{Learning a Predictable and Generative Vector Representation for Objects}},
  author    = {Girdhar, Rohit and Fouhey, David F. and Rodriguez, Mikel and Gupta, Abhinav},
  booktitle = {European Conference on Computer Vision},
  year      = {2016},
  pages     = {484-499},
  doi       = {10.1007/978-3-319-46466-4_29},
  url       = {https://mlanthology.org/eccv/2016/girdhar2016eccv-learning/}
}