Coherent Image Annotation by Learning Semantic Distance

Abstract

Conventional approaches to automatic image annotation usually suffer from two problems: (1) they cannot guarantee good semantic coherence among the words annotated for each image, because they treat each word independently and ignore the inherent semantic relations among words; (2) they rely heavily on visual similarity to judge semantic similarity. To address these issues, we propose a novel approach to image annotation that simultaneously learns a semantic distance by capturing prior annotation knowledge and propagates the annotation of an image as a whole entity. Specifically, a semantic distance function (SDF) is learned for each semantic cluster to measure semantic similarity based on relative comparison relations among prior annotations. To annotate a new image, the training images in each cluster are ranked according to their SDF values with respect to this image, and their corresponding annotations are then propagated to this image as a whole entity to ensure semantic coherence. We evaluate the SDF-based approach on Corel images against a Support Vector Machine-based approach. The experiments show that the SDF-based approach outperforms the SVM-based approach in terms of semantic coherence, especially when each training image is associated with multiple words.
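The annotation step described in the abstract (rank the training images of each cluster by SDF value, then propagate an annotation set as a whole) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the SDF or how candidates from different clusters are combined. The sketch assumes a weighted squared Euclidean distance as a stand-in for a learned SDF and assumes that the single closest training image across all clusters supplies the annotation. The names `make_weighted_sdf`, `rank_cluster`, and `annotate` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Sdf = Callable[[Vector, Vector], float]
# One training image: (feature vector, annotation words kept together).
TrainingImage = Tuple[Vector, Tuple[str, ...]]


def make_weighted_sdf(weights: Vector) -> Sdf:
    """Assumed stand-in for a learned per-cluster SDF:
    a weighted squared Euclidean distance."""
    def sdf(x: Vector, y: Vector) -> float:
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))
    return sdf


def rank_cluster(query: Vector, sdf: Sdf,
                 images: List[TrainingImage]) -> List[TrainingImage]:
    """Rank a cluster's training images by SDF value, closest first."""
    return sorted(images, key=lambda item: sdf(query, item[0]))


def annotate(query: Vector,
             clusters: List[Tuple[Sdf, List[TrainingImage]]]) -> List[str]:
    """Propagate the whole annotation set of the closest training image.

    Assumption: SDF values are compared directly across clusters.
    """
    best_dist = None
    best_words: Tuple[str, ...] = ()
    for sdf, images in clusters:
        ranked = rank_cluster(query, sdf, images)
        if not ranked:
            continue  # skip empty clusters
        features, words = ranked[0]
        dist = sdf(query, features)
        if best_dist is None or dist < best_dist:
            best_dist, best_words = dist, words
    return list(best_words)


# Toy data (invented for illustration only).
clusters = [
    (make_weighted_sdf([1.0, 1.0]),
     [([0.0, 0.0], ("sky", "sea")), ([5.0, 5.0], ("tiger", "grass"))]),
    (make_weighted_sdf([2.0, 0.5]),
     [([10.0, 10.0], ("city", "street"))]),
]
result = annotate([4.5, 5.2], clusters)
```

Because the words of a training image travel together as one tuple, the query never receives a mix of words drawn independently from unrelated images, which is the coherence property the abstract emphasizes.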

Cite

Text

Mei et al. "Coherent Image Annotation by Learning Semantic Distance." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2008. doi:10.1109/CVPR.2008.4587386

Markdown

[Mei et al. "Coherent Image Annotation by Learning Semantic Distance." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2008.](https://mlanthology.org/cvpr/2008/mei2008cvpr-coherent/) doi:10.1109/CVPR.2008.4587386

BibTeX

@inproceedings{mei2008cvpr-coherent,
  title     = {{Coherent Image Annotation by Learning Semantic Distance}},
  author    = {Mei, Tao and Wang, Yong and Hua, Xian-Sheng and Gong, Shaogang and Li, Shipeng},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2008},
  doi       = {10.1109/CVPR.2008.4587386},
  url       = {https://mlanthology.org/cvpr/2008/mei2008cvpr-coherent/}
}