Spatial-Content Image Search in Complex Scenes
Abstract
Although image search has been studied extensively over the last two decades, most works focus on either instance-level retrieval or semantic-level retrieval. In this work, we develop a novel spatial-semantic method, namely spatial-content image search, to retrieve images that not only share the same spatial semantics as the query image but also remain visually consistent with it in complex scenes. We achieve this goal by capturing the spatial-semantic concepts in an image as well as the visual representation of each concept. Specifically, we first generate a set of bounding boxes and their category labels, which represent spatial-semantic constraints, with YOLOv3, and then obtain the visual content of each bounding box from deep features extracted by a convolutional neural network. After that, we customize a similarity computation method that evaluates the relevance between dataset images and input queries according to the developed image representations. Experimental results on two large-scale benchmark retrieval datasets with images containing multiple objects demonstrate that our method provides an effective way to query image databases. Our code is available at https://github.com/MaJinWakeUp/spatial-content.
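The abstract describes representing each image as a set of detected objects (bounding box, category label, CNN feature) and scoring candidates by combining spatial-semantic agreement with visual similarity. A minimal illustrative sketch of that idea is below; the greedy same-label matching, the IoU/cosine blend, and the `alpha` weight are all assumptions for illustration, not the paper's actual similarity computation, and real detections from YOLOv3 plus CNN features would replace the toy inputs.

```python
# Illustrative sketch only: NOT the paper's similarity computation.
# Each image is a list of (label, box, feature) triples, where box is
# (x1, y1, x2, y2) and feature is a CNN descriptor for that box.
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def iou(a, b):
    """Intersection-over-union of two boxes, as a spatial-agreement proxy."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def spatial_content_score(query, candidate, alpha=0.5):
    """Greedily match each query object to an unused candidate object with
    the same label, scoring a pair by a blend of box overlap (spatial) and
    feature cosine similarity (content); alpha is a hypothetical weight."""
    score, used = 0.0, set()
    for q_label, q_box, q_feat in query:
        best, best_j = 0.0, None
        for j, (c_label, c_box, c_feat) in enumerate(candidate):
            if j in used or c_label != q_label:
                continue
            s = alpha * iou(q_box, c_box) + (1 - alpha) * cosine(q_feat, c_feat)
            if s > best:
                best, best_j = s, j
        if best_j is not None:
            used.add(best_j)
            score += best
    return score / max(len(query), 1)
```

Ranking a database then amounts to computing `spatial_content_score(query, img)` for every indexed image and sorting in descending order.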
Cite
Text
Ma et al. "Spatial-Content Image Search in Complex Scenes." Winter Conference on Applications of Computer Vision, 2020.
Markdown
[Ma et al. "Spatial-Content Image Search in Complex Scenes." Winter Conference on Applications of Computer Vision, 2020.](https://mlanthology.org/wacv/2020/ma2020wacv-spatialcontent/)
BibTeX
@inproceedings{ma2020wacv-spatialcontent,
title = {{Spatial-Content Image Search in Complex Scenes}},
author = {Ma, Jin and Pang, Shanmin and Yang, Bo and Zhu, Jihua and Li, Yaochen},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2020},
url = {https://mlanthology.org/wacv/2020/ma2020wacv-spatialcontent/}
}