Composed Query Image Retrieval Using Locally Bounded Features

Abstract

Composed query image retrieval is a new problem where the query consists of an image together with a requested modification expressed via a textual sentence. The goal is then to retrieve the images that are generally similar to the query image, but differ according to the requested modification. Previous methods usually consider the image as a whole. In this paper, we propose a novel method that represents the image using a set of local areas in the image. The relationship between each word in the modification text and each area in the image is then explicitly established, allowing the model to accurately correlate the modification text to parts of the image. We conduct extensive experiments on three benchmark datasets. The results show that our method outperforms other state-of-the-art approaches by a considerable margin.
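The word-to-region idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it swaps in plain scaled dot-product attention between word embeddings and region features, and every name here (`word_region_attention`, `rank_gallery`, the feature dimensions, the random inputs) is a hypothetical stand-in. In practice, the region features would come from some local-area extractor and the word features from a text encoder, neither of which the abstract specifies.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def word_region_attention(regions, words):
    """Relate every word to every image region (illustrative only).

    regions: (R, d) array, one feature vector per local image area.
    words:   (W, d) array, one embedding per word of the modification text.
    Returns the (W, R) attention map and a single (d,) composed query vector.
    """
    d = regions.shape[1]
    scores = words @ regions.T / np.sqrt(d)   # (W, R) word-region affinities
    attn = softmax(scores, axis=1)            # each word distributes weight over regions
    attended = attn @ regions                 # (W, d) region context gathered per word
    # Assumption: fuse text and attended image content by a simple sum and mean.
    composed = (attended + words).mean(axis=0)
    return attn, composed


def rank_gallery(composed, gallery):
    """Return gallery indices sorted by cosine similarity to the composed query."""
    q = composed / np.linalg.norm(composed)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ q))


# Toy usage with random features standing in for real encoders.
rng = np.random.default_rng(0)
regions = rng.normal(size=(6, 16))    # 6 local areas of the query image
words = rng.normal(size=(4, 16))      # 4 words of modification text
gallery = rng.normal(size=(20, 16))   # 20 candidate image embeddings

attn, composed = word_region_attention(regions, words)
ranking = rank_gallery(composed, gallery)
```

Each row of `attn` shows how one word spreads its weight across the image regions, which is the kind of explicit word-to-area correspondence the abstract contrasts with treating the image as a whole.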

Cite

Text

Hosseinzadeh and Wang. "Composed Query Image Retrieval Using Locally Bounded Features." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.00365

Markdown

[Hosseinzadeh and Wang. "Composed Query Image Retrieval Using Locally Bounded Features." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.](https://mlanthology.org/cvpr/2020/hosseinzadeh2020cvpr-composed/) doi:10.1109/CVPR42600.2020.00365

BibTeX

@inproceedings{hosseinzadeh2020cvpr-composed,
  title     = {{Composed Query Image Retrieval Using Locally Bounded Features}},
  author    = {Hosseinzadeh, Mehrdad and Wang, Yang},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2020},
  doi       = {10.1109/CVPR42600.2020.00365},
  url       = {https://mlanthology.org/cvpr/2020/hosseinzadeh2020cvpr-composed/}
}