Joint 3D Estimation of Objects and Scene Layout

Abstract

We propose a novel generative model that reasons jointly about the 3D scene layout and the 3D location and orientation of objects in the scene. In particular, we infer the scene topology, geometry, and traffic activities from a short video sequence acquired with a single camera mounted on a moving car. Our generative model takes advantage of dynamic information in the form of vehicle tracklets as well as static information coming from semantic labels and geometry (i.e., vanishing points). Experiments show that our approach outperforms a discriminative baseline based on multiple kernel learning (MKL), which has access to the same image information. Furthermore, because we reason about objects in 3D, we are able to significantly improve the object orientation estimates of state-of-the-art object detectors.

Cite

Text

Geiger et al. "Joint 3D Estimation of Objects and Scene Layout." Neural Information Processing Systems, 2011.

Markdown

[Geiger et al. "Joint 3D Estimation of Objects and Scene Layout." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/geiger2011neurips-joint/)

BibTeX

@inproceedings{geiger2011neurips-joint,
  title     = {{Joint 3D Estimation of Objects and Scene Layout}},
  author    = {Geiger, Andreas and Wojek, Christian and Urtasun, Raquel},
  booktitle = {Neural Information Processing Systems},
  year      = {2011},
  pages     = {1467--1475},
  url       = {https://mlanthology.org/neurips/2011/geiger2011neurips-joint/}
}