A Probabilistic Semantic Model for Image Annotation and Multi-Modal Image Retrieval

Abstract

This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer; this layer constitutes the semantic concepts to be discovered and explicitly exploits the synergy among the modalities. (2) The association of visual features and textual words is determined in a Bayesian framework, so that a confidence can be provided for each association. (3) We report an extensive evaluation of the prototype system based on the model, using a large-scale, visually and semantically diverse image collection crawled from the Web. In the proposed probabilistic model, the hidden concept layer that connects the visual feature layer and the word layer is discovered by fitting a generative model to the training images and annotation words through an Expectation-Maximization (EM) based iterative learning procedure. The evaluation of the prototype system for multi-modal image retrieval, on 17,000 images and 7,736 annotation words automatically extracted from crawled Web pages, indicates that the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.
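
As a rough illustration of the idea in the abstract, the sketch below fits a small hidden-concept mixture model with EM and then scores annotation words for a new image. It is not the authors' implementation: the abstract does not specify the component distributions, so the diagonal Gaussian over visual features, the multinomial over words, the smoothing constants, and the names `fit_em` and `annotate` are all assumptions made for this example.

```python
import numpy as np

def log_gauss(X, mu, var):
    # Log density of diagonal Gaussians: X (N, D), mu/var (K, D) -> (N, K).
    diff = X[:, None, :] - mu[None, :, :]
    return -0.5 * (np.log(2 * np.pi * var)[None] + diff ** 2 / var[None]).sum(axis=2)

def softmax_rows(L):
    # Normalize rows of log-scores into probabilities, stably.
    P = np.exp(L - L.max(axis=1, keepdims=True))
    return P / P.sum(axis=1, keepdims=True)

def fit_em(X, W, K, iters=100, seed=0):
    """Fit K hidden concepts to visual features X (N, D) and word counts W (N, V).

    Assumed model (hypothetical, for illustration):
    P(x, w) = sum_z P(z) * Gaussian(x | z) * Multinomial(w | z).
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    V = W.shape[1]
    mu = X[rng.choice(N, K, replace=False)].astype(float)  # means start at random images
    var = np.tile(X.var(axis=0) + 1e-6, (K, 1))             # start from the global variance
    pw = rng.random((K, V)) + 0.1                           # P(w | z), random positive start
    pw /= pw.sum(axis=1, keepdims=True)
    pz = np.full(K, 1.0 / K)                                # P(z), uniform start
    for _ in range(iters):
        # E-step: posterior P(z | image, words) for every training pair.
        R = softmax_rows(np.log(pz)[None] + log_gauss(X, mu, var) + W @ np.log(pw).T)
        # M-step: re-estimate all parameters from the soft assignments.
        Nk = R.sum(axis=0) + 1e-12
        pz = Nk / Nk.sum()
        mu = (R.T @ X) / Nk[:, None]
        var = np.maximum((R.T @ X ** 2) / Nk[:, None] - mu ** 2, 1e-6)
        pw = R.T @ W + 1e-3                                 # small additive smoothing
        pw /= pw.sum(axis=1, keepdims=True)
    return pz, mu, var, pw

def annotate(x, pz, mu, var, pw):
    # P(w | x) = sum_z P(w | z) * P(z | x): a confidence score for every word.
    post = softmax_rows(np.log(pz)[None] + log_gauss(x[None, :], mu, var))[0]
    return post @ pw
```

Calling `annotate` on the feature vector of an unlabeled image returns a probability for every vocabulary word; ranking those probabilities gives candidate annotations together with a confidence for each, which is the role the abstract assigns to the Bayesian framework.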

Cite

Text

Zhang et al. "A Probabilistic Semantic Model for Image Annotation and Multi-Modal Image Retrieval." IEEE/CVF International Conference on Computer Vision, 2005. doi:10.1109/ICCV.2005.16

Markdown

[Zhang et al. "A Probabilistic Semantic Model for Image Annotation and Multi-Modal Image Retrieval." IEEE/CVF International Conference on Computer Vision, 2005.](https://mlanthology.org/iccv/2005/zhang2005iccv-probabilistic/) doi:10.1109/ICCV.2005.16

BibTeX

@inproceedings{zhang2005iccv-probabilistic,
  title     = {{A Probabilistic Semantic Model for Image Annotation and Multi-Modal Image Retrieval}},
  author    = {Zhang, Ruofei and Zhang, Zhongfei (Mark) and Li, Mingjing and Ma, Wei-Ying and Zhang, HongJiang},
  booktitle = {IEEE/CVF International Conference on Computer Vision},
  year      = {2005},
  pages     = {846-851},
  doi       = {10.1109/ICCV.2005.16},
  url       = {https://mlanthology.org/iccv/2005/zhang2005iccv-probabilistic/}
}