Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval

Abstract

ImageNet pre-training has long been considered crucial by the fine-grained sketch-based image retrieval (FG-SBIR) community due to the lack of large sketch-photo paired datasets for FG-SBIR training. In this paper, we propose a self-supervised alternative for representation pre-training. Specifically, we consider the jigsaw puzzle game of recomposing images from shuffled parts. We identify two key facets of jigsaw task design that are required for effective FG-SBIR pre-training. The first is formulating the puzzle in a mixed-modality fashion. Second, we show that framing the optimisation as permutation matrix inference via Sinkhorn iterations is more effective than the common classifier formulation of jigsaw self-supervision. Experiments show that this self-supervised pre-training strategy significantly outperforms the standard ImageNet-based pipeline across all four product-level FG-SBIR benchmarks. Interestingly, it also leads to improved cross-category generalisation across both pre-train/fine-tune and fine-tune/testing stages.
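The two design facets named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the patch grid, how the modalities are mixed, or the network that produces the score matrix, so the per-patch random choice between photo and sketch, the function names `mixed_modal_puzzle` and `sinkhorn`, and the `temperature` parameter are all assumptions made for illustration. The `sinkhorn` function shows only the generic idea of relaxing a permutation matrix to a doubly stochastic one by alternating row and column normalisation.

```python
import math
import random


def mixed_modal_puzzle(photo_patches, sketch_patches, rng):
    """Build one shuffled puzzle from aligned photo and sketch patches.

    Assumption: each grid cell is drawn from either modality with equal
    probability. Returns the shuffled patches and the permutation used,
    where shuffled[k] came from original grid position perm[k].
    """
    n = len(photo_patches)
    mixed = [rng.choice((photo_patches[i], sketch_patches[i])) for i in range(n)]
    perm = list(range(n))
    rng.shuffle(perm)
    return [mixed[p] for p in perm], perm


def sinkhorn(scores, n_iters=20, temperature=1.0):
    """Relax a square score matrix towards a doubly stochastic matrix.

    Exponentiates scores / temperature, then alternately normalises rows
    and columns so that both approach sums of 1.
    """
    n = len(scores)
    # Subtract the global maximum before exponentiating for stability.
    top = max(max(row) for row in scores)
    mat = [[math.exp((s - top) / temperature) for s in row] for row in scores]
    for _ in range(n_iters):
        # Row normalisation: every row sums to 1.
        mat = [[v / total for v in row] for row, total in
               ((row, sum(row)) for row in mat)]
        # Column normalisation: every column sums to 1.
        col_sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        mat = [[mat[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return mat


if __name__ == "__main__":
    rng = random.Random(0)
    photo = ["p%d" % i for i in range(9)]   # stand-ins for 3x3 photo patches
    sketch = ["s%d" % i for i in range(9)]  # stand-ins for 3x3 sketch patches
    shuffled, perm = mixed_modal_puzzle(photo, sketch, rng)
    print(shuffled, perm)

    # Hypothetical patch-to-position scores, as a network might output.
    scores = [[2.0, 0.1, 0.3], [0.2, 1.5, 0.4], [0.1, 0.3, 1.8]]
    for row in sinkhorn(scores):
        print([round(v, 3) for v in row])
```

In a training setup along these lines, the relaxed matrix would be compared against the ground-truth permutation with a differentiable loss, whereas the classifier formulation would instead predict an index into a fixed set of permutations. Lowering the assumed `temperature` pushes the relaxed matrix closer to a hard permutation.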

Cite

Text

Pang et al. "Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.01036

Markdown

[Pang et al. "Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.](https://mlanthology.org/cvpr/2020/pang2020cvpr-solving/) doi:10.1109/CVPR42600.2020.01036

BibTeX

@inproceedings{pang2020cvpr-solving,
  title     = {{Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval}},
  author    = {Pang, Kaiyue and Yang, Yongxin and Hospedales, Timothy M. and Xiang, Tao and Song, Yi-Zhe},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2020},
  doi       = {10.1109/CVPR42600.2020.01036},
  url       = {https://mlanthology.org/cvpr/2020/pang2020cvpr-solving/}
}