Spatial-Content Image Search in Complex Scenes

Abstract

Although image search has been heavily studied over the last two decades, most works have focused on either instance-level or semantic-level retrieval. In this work, we develop a novel spatial-semantic method, namely spatial-content image search, to retrieve images that not only share the same spatial semantics as the query image but also remain visually consistent with it in complex scenes. We achieve this by capturing the spatial-semantic concepts in an image as well as the visual representation of each concept. Specifically, we first generate a set of bounding boxes and their category labels with YOLOv3, which together encode the spatial-semantic constraints, and then obtain the visual content of each bounding box from deep features extracted by a convolutional neural network. We then design a similarity measure that evaluates the relevance between dataset images and input queries based on these image representations. Experimental results on two large-scale benchmark retrieval datasets whose images contain multiple objects demonstrate that our method provides an effective way to query image databases. Our code is available at https://github.com/MaJinWakeUp/spatial-content.
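The abstract's pipeline (per-object boxes with labels, per-box deep features, then a combined spatial-plus-visual similarity) can be illustrated with a minimal sketch. This is not the paper's actual scoring function; the `alpha` weighting, the IoU term, and the greedy best-match aggregation are illustrative assumptions, and the per-box `feature` vectors stand in for CNN descriptors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def image_similarity(query, candidate, alpha=0.5):
    """Score a candidate image against a query image.

    Each image is a list of detections (label, box, feature), e.g. from
    a detector such as YOLOv3 plus a CNN feature extractor. For every
    query object we take the best-matching candidate object with the
    SAME label (the semantic constraint), scoring it as a weighted sum
    of spatial overlap (IoU) and visual similarity (cosine).
    """
    if not query:
        return 0.0
    total = 0.0
    for q_label, q_box, q_feat in query:
        best = 0.0
        for c_label, c_box, c_feat in candidate:
            if c_label != q_label:
                continue  # label mismatch violates the semantic constraint
            score = alpha * iou(q_box, c_box) + (1 - alpha) * cosine(q_feat, c_feat)
            best = max(best, score)
        total += best
    return total / len(query)
```

Under this sketch, a candidate identical to the query scores 1.0, a candidate with no shared object categories scores 0.0, and partial spatial or visual agreement falls in between.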

Cite

Text

Ma et al. "Spatial-Content Image Search in Complex Scenes." Winter Conference on Applications of Computer Vision, 2020.

Markdown

[Ma et al. "Spatial-Content Image Search in Complex Scenes." Winter Conference on Applications of Computer Vision, 2020.](https://mlanthology.org/wacv/2020/ma2020wacv-spatialcontent/)

BibTeX

@inproceedings{ma2020wacv-spatialcontent,
  title     = {{Spatial-Content Image Search in Complex Scenes}},
  author    = {Ma, Jin and Pang, Shanmin and Yang, Bo and Zhu, Jihua and Li, Yaochen},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2020},
  url       = {https://mlanthology.org/wacv/2020/ma2020wacv-spatialcontent/}
}