Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval
Abstract
The key to Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) is to establish fine-grained correspondence between sketches and images. Since sketches consist only of abstract strokes, stroke recognition ability plays an important role in FG-SBIR. However, existing works usually ignore the unique characteristics of sketches and treat images and sketches equally. To address this problem, we propose Conditional Stroke Recovery (CSR) to enhance stroke recognition ability for FG-SBIR, in which we introduce an auxiliary task that requires the network to recover the strokes using the paired image as a condition. In this way, the network learns to better match the strokes with the corresponding image elements. To complete the auxiliary task, we propose an unsupervised stroke disorder algorithm, which performs well at stroke extraction and sketch augmentation. In addition, we identify two weaknesses of the common triplet loss and propose a double-anchor InfoNCE loss to reduce cosine distances between sketch-image pairs. Comprehensive experiments using various backbones are conducted on four datasets (i.e., QMUL-Shoe, QMUL-Chair, QMUL-ShoeV2, and Sketchy). In terms of acc@1, our method outperforms previous works by a large margin.
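The abstract does not spell out the double-anchor InfoNCE loss, so the following is only a minimal sketch of a generic symmetric InfoNCE loss over cosine similarities, under the assumption that "double-anchor" means both the sketch and the image take a turn as the anchor. The function names, the `temperature` value, and the batch layout (row `i` of each array is a matched sketch-image pair) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(sketch_emb, image_emb, temperature=0.1):
    """Hypothetical symmetric InfoNCE; row i of each input is a matched pair."""
    s = l2_normalize(np.asarray(sketch_emb, dtype=np.float64))
    v = l2_normalize(np.asarray(image_emb, dtype=np.float64))
    logits = s @ v.T / temperature          # [B, B] scaled cosine similarities
    idx = np.arange(logits.shape[0])
    # Sketch as anchor: contrast each sketch against all images in the batch.
    loss_sketch = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Image as anchor: contrast each image against all sketches in the batch.
    loss_image = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_sketch + loss_image)
```

Minimizing this quantity raises the cosine similarity of matched pairs relative to every other pairing in the batch, which is the general effect the abstract attributes to its loss; the paper's actual formulation may differ.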
Cite
Text
Ling et al. "Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19809-0_41
Markdown
[Ling et al. "Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/ling2022eccv-conditional/) doi:10.1007/978-3-031-19809-0_41
BibTeX
@inproceedings{ling2022eccv-conditional,
title = {{Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval}},
author = {Ling, Zhixin and Xing, Zhen and Zhou, Jian and Zhou, Xiangdong},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
doi = {10.1007/978-3-031-19809-0_41},
url = {https://mlanthology.org/eccv/2022/ling2022eccv-conditional/}
}