A Model for Learning the Semantics of Pictures

Abstract

We propose an approach to learning the semantics of images which allows us to automatically annotate an image with keywords and to retrieve images based on text queries. We do this using a formalism that models the generation of annotated images. We assume that every image is divided into regions, each described by a continuous-valued feature vector. Given a training set of images with annotations, we compute a joint probabilistic model of image features and words which allows us to predict the probability of generating a word given the image regions. This may be used to automatically annotate and retrieve images given a word as a query. Experiments show that our model significantly outperforms the best of the previously reported results on the tasks of automatic image annotation and retrieval.
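The sketch below illustrates the kind of joint model the abstract describes, not the authors' implementation: each training image contributes a Gaussian kernel density over its region feature vectors and a smoothed word distribution, and P(word | query regions) is obtained by marginalising over the training images. Function names, the bandwidth, and the smoothing parameter mu are illustrative assumptions.

import numpy as np

def region_likelihood(query_regions, train_regions, bandwidth=1.0):
    """P(query regions | training image) under a Gaussian kernel estimate (assumed form)."""
    # query_regions: (n_q, d), train_regions: (n_t, d)
    diffs = query_regions[:, None, :] - train_regions[None, :, :]    # (n_q, n_t, d)
    sq = (diffs ** 2).sum(axis=-1) / (2.0 * bandwidth ** 2)
    d = query_regions.shape[1]
    norm = (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    kernel = np.exp(-sq) / norm                                       # (n_q, n_t)
    per_region = kernel.mean(axis=1)                                  # average kernel over the training image's regions
    return float(np.prod(per_region))                                 # regions treated as independent given the image

def annotate(query_regions, training_set, vocab, mu=0.1):
    """Return P(w | query image) for every word w in vocab (uniform prior over training images)."""
    scores = np.zeros(len(vocab))
    for regions, words in training_set:
        p_regions = region_likelihood(query_regions, regions)
        counts = np.array([words.count(w) for w in vocab], dtype=float)
        p_words = (counts + mu) / (counts.sum() + mu * len(vocab))    # smoothed per-image word distribution
        scores += p_regions * p_words                                 # marginalise over training images
    return scores / scores.sum()                                      # normalise to a distribution over vocab

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vocab = ["tiger", "grass", "water"]
    training_set = [
        (rng.normal(0.0, 1.0, size=(4, 8)), ["tiger", "grass"]),
        (rng.normal(3.0, 1.0, size=(5, 8)), ["water"]),
    ]
    query = rng.normal(0.0, 1.0, size=(3, 8))
    print(dict(zip(vocab, annotate(query, training_set, vocab).round(3))))

Annotation takes the highest-probability words for a query image; retrieval for a text query ranks images by the probability assigned to the query word, which is the usage the abstract describes.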

Cite

Text

Lavrenko et al. "A Model for Learning the Semantics of Pictures." Neural Information Processing Systems, 2003.

Markdown

[Lavrenko et al. "A Model for Learning the Semantics of Pictures." Neural Information Processing Systems, 2003.](https://mlanthology.org/neurips/2003/lavrenko2003neurips-model/)

BibTeX

@inproceedings{lavrenko2003neurips-model,
  title     = {{A Model for Learning the Semantics of Pictures}},
  author    = {Lavrenko, Victor and Manmatha, R. and Jeon, Jiwoon},
  booktitle = {Neural Information Processing Systems},
  year      = {2003},
  pages     = {553-560},
  url       = {https://mlanthology.org/neurips/2003/lavrenko2003neurips-model/}
}