Learning the Compositional Nature of Visual Objects

Abstract

The compositional nature of visual objects significantly limits their representation complexity and renders the learning of structured object models tractable. Adopting this modeling strategy, we (i) automatically decompose objects into a hierarchy of relevant compositions and (ii) learn such a compositional representation for each category without supervision. The compositional structure supports feature sharing even at the lowest level of small image patches. Compositions are represented as probability distributions over their constituent parts and the relations between them. The global shape of objects is captured by a graphical model that combines all compositions. Inference based on the underlying statistical model is then employed to obtain a category-level object recognition system. Experiments on large standard benchmark datasets underline the competitive recognition performance of this approach and provide insights into the learned compositional structure of objects.

Cite

Text

Ommer and Buhmann. "Learning the Compositional Nature of Visual Objects." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007. doi:10.1109/CVPR.2007.383154

Markdown

[Ommer and Buhmann. "Learning the Compositional Nature of Visual Objects." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007.](https://mlanthology.org/cvpr/2007/ommer2007cvpr-learning/) doi:10.1109/CVPR.2007.383154

BibTeX

@inproceedings{ommer2007cvpr-learning,
  title     = {{Learning the Compositional Nature of Visual Objects}},
  author    = {Ommer, Björn and Buhmann, Joachim M.},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2007},
  doi       = {10.1109/CVPR.2007.383154},
  url       = {https://mlanthology.org/cvpr/2007/ommer2007cvpr-learning/}
}