More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

Abstract

A fundamental challenge faced by existing Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) models is data scarcity: model performance is largely bottlenecked by the lack of sketch-photo pairs. Whilst the number of photos can be easily scaled, each corresponding sketch still needs to be individually produced. In this paper, we aim to mitigate such an upper bound on sketch data, and study whether unlabelled photos alone (of which there are many) can be cultivated for performance gain. In particular, we introduce a novel semi-supervised framework for cross-modal retrieval that can additionally leverage large-scale unlabelled photos to account for data scarcity. At the center of our semi-supervision design is a sequential photo-to-sketch generation model that aims to generate paired sketches for unlabelled photos. Importantly, we further introduce a discriminator-guided mechanism to guard against unfaithful generation, together with a distillation loss-based regularizer to provide tolerance against noisy training samples. Last but not least, we treat generation and retrieval as two conjugate problems, where a joint learning procedure is devised so that each module benefits from the other. Extensive experiments show that our semi-supervised model yields a significant performance boost over state-of-the-art supervised alternatives, as well as over existing methods that can exploit unlabelled photos for FG-SBIR.
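
As a rough illustration of how generated sketches for unlabelled photos might be folded into retrieval training, the following is a minimal sketch and not the authors' implementation. It assumes a triplet ranking loss over embedding vectors and a per-sample confidence in [0, 1] (for instance, one derived from a discriminator) that down-weights unfaithful generated sketches. Neither the triplet loss, the weighting scheme, nor the names `triplet_loss` and `semi_supervised_loss` come from the abstract; all are hypothetical.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(sketch, photo_pos, photo_neg, margin=0.2):
    """Hinge-style triplet loss: pull the matching photo closer than the
    non-matching one by at least `margin` (assumed loss, not from the abstract)."""
    return max(0.0, l2(sketch, photo_pos) - l2(sketch, photo_neg) + margin)


def semi_supervised_loss(labelled, pseudo, margin=0.2):
    """Average loss over real and generated sketch-photo triplets.

    labelled: list of (sketch, photo_pos, photo_neg) with hand-drawn sketches.
    pseudo:   list of (generated_sketch, photo_pos, photo_neg, confidence),
              where `confidence` is a hypothetical weight in [0, 1] that
              reduces the influence of unfaithful generated sketches.
    """
    supervised = sum(triplet_loss(s, p, n, margin) for s, p, n in labelled)
    weighted = sum(c * triplet_loss(s, p, n, margin) for s, p, n, c in pseudo)
    count = len(labelled) + len(pseudo)
    return (supervised + weighted) / count if count else 0.0


# Toy 2-D embeddings: one real pair that is already well ranked, and one
# generated sketch that is badly ranked but only half trusted.
labelled = [([0.0, 0.0], [0.0, 0.0], [1.0, 0.0])]
pseudo = [([0.0, 0.0], [1.0, 0.0], [0.0, 0.0], 0.5)]
loss = semi_supervised_loss(labelled, pseudo)
```

Under these assumptions, a generated sketch with low confidence contributes proportionally less to the total loss, which is one simple way to tolerate noisy pseudo-pairs.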

Cite

Text

Bhunia et al. "More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.00423

Markdown

[Bhunia et al. "More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/bhunia2021cvpr-more/) doi:10.1109/CVPR46437.2021.00423

BibTeX

@inproceedings{bhunia2021cvpr-more,
  title     = {{More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval}},
  author    = {Bhunia, Ayan Kumar and Chowdhury, Pinaki Nath and Sain, Aneeshan and Yang, Yongxin and Xiang, Tao and Song, Yi-Zhe},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {4247--4256},
  doi       = {10.1109/CVPR46437.2021.00423},
  url       = {https://mlanthology.org/cvpr/2021/bhunia2021cvpr-more/}
}