You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

Abstract

Two primary input modalities prevail in image retrieval: sketch and text. While text is widely used for inter-category retrieval tasks, sketches have been established as the sole preferred modality for fine-grained image retrieval due to their ability to capture intricate visual details. In this paper, we question the reliance on sketches alone for fine-grained image retrieval by simultaneously exploring the fine-grained representation capabilities of both sketch and text, orchestrating a duet between the two. The end result enables precise retrievals previously unattainable, allowing users to pose ever-finer queries and incorporate attributes like colour and contextual cues from text. For this purpose, we introduce a novel compositionality framework, effectively combining sketches and text using pre-trained CLIP models, while eliminating the need for extensive fine-grained textual descriptions. Last but not least, our system extends to novel applications in composed image retrieval, domain attribute transfer, and fine-grained generation, providing solutions for various real-world scenarios.
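
To make the idea of a composed sketch-plus-text query concrete, the following is a minimal illustrative sketch of generic late fusion: two query embeddings are normalised, mixed, and used to rank a gallery by cosine similarity. It is not the compositionality framework proposed in the paper, whose details the abstract does not give. The function names (`compose_query`, `rank_gallery`), the `text_weight` parameter, and the three-dimensional toy vectors are all hypothetical stand-ins for real CLIP features.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


def compose_query(sketch_emb, text_emb, text_weight=0.5):
    """Hypothetical late fusion: weighted sum of unit-length sketch and text embeddings.

    This is an assumption made for illustration, not the method from the paper.
    """
    s = l2_normalize(sketch_emb)
    t = l2_normalize(text_emb)
    mixed = [(1.0 - text_weight) * a + text_weight * b for a, b in zip(s, t)]
    return l2_normalize(mixed)


def rank_gallery(query_emb, gallery):
    """Return gallery keys sorted by descending cosine similarity to the query."""
    scores = {
        name: sum(q * g for q, g in zip(query_emb, l2_normalize(emb)))
        for name, emb in gallery.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (placeholders for CLIP features): axis 0 ~ "shoe shape",
# axis 1 ~ "red", axis 2 ~ an unrelated attribute.
sketch_emb = [1.0, 0.0, 0.0]  # the sketch conveys shape only
text_emb = [0.0, 1.0, 0.0]    # the text adds the colour cue
gallery = {
    "red_shoe": [1.0, 1.0, 0.0],
    "blue_shoe": [1.0, 0.0, 1.0],
    "red_chair": [0.0, 1.0, 1.0],
}

query = compose_query(sketch_emb, text_emb)
ranking = rank_gallery(query, gallery)
print(ranking[0])
```

In this toy setting the composed query favours the gallery item that matches both the sketched shape and the colour named in the text, which is the kind of finer query the abstract describes; a real system would obtain the embeddings from pre-trained encoders rather than hand-written vectors.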

Cite

Text

Koley et al. "You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval." Conference on Computer Vision and Pattern Recognition, 2024.

Markdown

[Koley et al. "You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/koley2024cvpr-you/)

BibTeX

@inproceedings{koley2024cvpr-you,
  title     = {{You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval}},
  author    = {Koley, Subhadeep and Bhunia, Ayan Kumar and Sain, Aneeshan and Chowdhury, Pinaki Nath and Xiang, Tao and Song, Yi-Zhe},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {16509-16519},
  url       = {https://mlanthology.org/cvpr/2024/koley2024cvpr-you/}
}