A Scalable Tree-Based Approach for Joint Object and Pose Recognition

Abstract

Recognizing possibly thousands of objects is a crucial capability for an autonomous agent to understand and interact with everyday environments. Practical object recognition comes in multiple forms: Is this a coffee mug? (category recognition). Is this Alice's coffee mug? (instance recognition). Is the mug with the handle facing left or right? (pose recognition). We present a scalable framework, Object-Pose Tree, which efficiently organizes data into a semantically structured tree. The tree structure enables both scalable training and testing, allowing us to solve recognition over thousands of object poses in near real-time. Moreover, by simultaneously optimizing all three tasks, our approach outperforms standard nearest neighbor and 1-vs-all classifications, with large improvements on pose recognition. We evaluate the proposed technique on a dataset of 300 household objects collected using a Kinect-style 3D camera. Experiments demonstrate that our system achieves robust and efficient object category, instance, and pose recognition on challenging everyday objects.

Cite

Text

Lai et al. "A Scalable Tree-Based Approach for Joint Object and Pose Recognition." AAAI Conference on Artificial Intelligence, 2011. doi:10.1609/AAAI.V25I1.7986

Markdown

[Lai et al. "A Scalable Tree-Based Approach for Joint Object and Pose Recognition." AAAI Conference on Artificial Intelligence, 2011.](https://mlanthology.org/aaai/2011/lai2011aaai-scalable/) doi:10.1609/AAAI.V25I1.7986

BibTeX

@inproceedings{lai2011aaai-scalable,
  title     = {{A Scalable Tree-Based Approach for Joint Object and Pose Recognition}},
  author    = {Lai, Kevin and Bo, Liefeng and Ren, Xiaofeng and Fox, Dieter},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2011},
  pages     = {1474--1480},
  doi       = {10.1609/AAAI.V25I1.7986},
  url       = {https://mlanthology.org/aaai/2011/lai2011aaai-scalable/}
}