Hierarchy-Based Image Embeddings for Semantic Image Retrieval

Abstract

Deep neural networks trained for classification have been found to learn powerful image representations, which are also often used for other tasks such as comparing images w.r.t. their visual similarity. However, visual similarity does not imply semantic similarity. In order to learn semantically discriminative features, we propose to map images onto class embeddings whose pair-wise dot products correspond to a measure of semantic similarity between classes. Such an embedding not only improves image retrieval results, but could also facilitate integrating semantics into other tasks, e.g., novelty detection or few-shot learning. We introduce a deterministic algorithm for computing the class centroids directly based on prior world-knowledge encoded in a hierarchy of classes such as WordNet. Experiments on CIFAR-100, NABirds, and ImageNet show that our learned semantic image embeddings improve the semantic consistency of image retrieval results by a large margin.
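
The embedding construction described in the abstract can be illustrated with a small, self-contained sketch. The snippet below is not the paper's deterministic algorithm; it builds a hypothetical toy hierarchy, scores class pairs by the height of their lowest common ancestor (normalized by the root's height), and factorizes the resulting similarity matrix with a Cholesky decomposition so that the pairwise dot products of the rows reproduce the prescribed semantic similarities. All class names and the jitter term are illustrative assumptions.

import numpy as np

# Hypothetical toy hierarchy given as child -> parent edges (single root "entity").
PARENT = {
    "animal": "entity", "vehicle": "entity",
    "dog": "animal", "cat": "animal",
    "car": "vehicle", "bus": "vehicle",
}
CLASSES = ["dog", "cat", "car", "bus"]  # leaf classes to embed

def ancestors(node):
    """Path from `node` up to the root, including `node` itself."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def lca(u, v):
    """Lowest common ancestor of two nodes."""
    on_v_path = set(ancestors(v))
    return next(a for a in ancestors(u) if a in on_v_path)

def height(node):
    """Length of the longest downward path from `node` to a leaf."""
    children = [c for c, p in PARENT.items() if p == node]
    return 0 if not children else 1 + max(height(c) for c in children)

max_height = height("entity")

# Semantic similarity between classes: 1 minus the normalized LCA height.
n = len(CLASSES)
S = np.empty((n, n))
for i, u in enumerate(CLASSES):
    for j, v in enumerate(CLASSES):
        S[i, j] = 1.0 - height(lca(u, v)) / max_height

# If this similarity matrix is positive semi-definite (it is for the toy tree
# above, and such a property is what makes this kind of embedding possible),
# a Cholesky factorization S = L @ L.T yields unit-norm class embeddings
# (the rows of L) whose pairwise dot products reproduce S exactly.
L = np.linalg.cholesky(S + 1e-9 * np.eye(n))  # tiny jitter for numerical safety
class_embeddings = {c: L[i] for i, c in enumerate(CLASSES)}

assert np.allclose(L @ L.T, S, atol=1e-4)
print(np.round(L @ L.T, 3))  # recovers the semantic similarity matrix

Any factorization of S would serve the same purpose; the paper derives its class centroids with its own deterministic procedure, and the Cholesky factor above is just one convenient choice for illustration.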

Cite

Text

Barz and Denzler. "Hierarchy-Based Image Embeddings for Semantic Image Retrieval." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019. doi:10.1109/WACV.2019.00073

Markdown

[Barz and Denzler. "Hierarchy-Based Image Embeddings for Semantic Image Retrieval." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019.](https://mlanthology.org/wacv/2019/barz2019wacv-hierarchy/) doi:10.1109/WACV.2019.00073

BibTeX

@inproceedings{barz2019wacv-hierarchy,
  title     = {{Hierarchy-Based Image Embeddings for Semantic Image Retrieval}},
  author    = {Barz, Björn and Denzler, Joachim},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2019},
  pages     = {638--647},
  doi       = {10.1109/WACV.2019.00073},
  url       = {https://mlanthology.org/wacv/2019/barz2019wacv-hierarchy/}
}