Real-Time Indoor Scene Understanding Using Bayesian Filtering with Motion Cues

Abstract

We present a method whereby an embodied agent using visual perception can efficiently create a model of a local indoor environment from its experience of moving within it. Our method uses motion cues to compute likelihoods of indoor structure hypotheses, based on simple, generic geometric knowledge about points, lines, planes, and motion. We present a single-image analysis, not to attempt to identify a single accurate model, but to propose a set of plausible hypotheses about the structure of the environment from an initial frame. We then use data from subsequent frames to update a Bayesian posterior probability distribution over the set of hypotheses. The likelihood function is efficiently computable by comparing the predicted location of point features on the environment model to their actual tracked locations in the image stream. Our method runs in real time, and it avoids the need for extensive prior training and the Manhattan-world assumption, which makes it more practical and efficient for an intelligent robot to understand its surroundings than most previous scene understanding methods. Experimental results on a collection of indoor videos suggest that our method is capable of an unprecedented combination of accuracy and efficiency.
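
The posterior update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `update_posterior`, the isotropic Gaussian error model, and the noise scale `sigma` are assumptions made here for illustration. The abstract states only that the likelihood compares predicted point-feature locations with their tracked locations.

```python
import math


def update_posterior(prior, reprojection_errors, sigma=2.0):
    """One Bayesian filtering step over a discrete set of structure hypotheses.

    prior: probability of each hypothesis before the new frame.
    reprojection_errors: for each hypothesis, the pixel distances between
        the feature locations it predicts and the tracked locations.
    sigma: assumed standard deviation of tracking noise in pixels
        (hypothetical value, not taken from the paper).
    Assumes at least one hypothesis has non-zero prior probability.
    """
    log_post = []
    for p, errs in zip(prior, reprojection_errors):
        # Assumed Gaussian log-likelihood, up to a constant shared by all hypotheses.
        log_lik = sum(-0.5 * (e / sigma) ** 2 for e in errs)
        log_post.append(math.log(p) + log_lik if p > 0 else float("-inf"))

    # Normalize in log space to avoid underflow when many features are tracked.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Three hypotheses with a uniform prior; the first predicts the tracked
# feature locations most closely, the second least closely.
prior = [1 / 3, 1 / 3, 1 / 3]
errors = [[0.5, 1.0, 0.8], [4.0, 6.0, 5.0], [2.0, 2.5, 3.0]]
posterior = update_posterior(prior, errors)
```

Feeding each frame's posterior back in as the next frame's prior gives the recursive filtering behavior: hypotheses whose predictions keep disagreeing with the tracked features lose probability mass over time.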

Cite

Text

Tsai et al. "Real-Time Indoor Scene Understanding Using Bayesian Filtering with Motion Cues." IEEE/CVF International Conference on Computer Vision, 2011. doi:10.1109/ICCV.2011.6126233

Markdown

[Tsai et al. "Real-Time Indoor Scene Understanding Using Bayesian Filtering with Motion Cues." IEEE/CVF International Conference on Computer Vision, 2011.](https://mlanthology.org/iccv/2011/tsai2011iccv-real/) doi:10.1109/ICCV.2011.6126233

BibTeX

@inproceedings{tsai2011iccv-real,
  title     = {{Real-Time Indoor Scene Understanding Using Bayesian Filtering with Motion Cues}},
  author    = {Tsai, Grace and Xu, Changhai and Liu, Jingen and Kuipers, Benjamin},
  booktitle = {IEEE/CVF International Conference on Computer Vision},
  year      = {2011},
  pages     = {121--128},
  doi       = {10.1109/ICCV.2011.6126233},
  url       = {https://mlanthology.org/iccv/2011/tsai2011iccv-real/}
}