Learning Dense Correspondences Between Photos and Sketches

Abstract

Humans effortlessly grasp the connection between sketches and real-world objects, even when these sketches are far from realistic. Moreover, human sketch understanding goes beyond categorization – critically, it also entails understanding how individual elements within a sketch correspond to parts of the physical world it represents. What are the computational ingredients needed to support this ability? Towards answering this question, we make two contributions: first, we introduce a new sketch-photo correspondence benchmark, PSC6k, containing 150K annotations of 6250 sketch-photo pairs across 125 object categories, augmenting the existing Sketchy dataset with fine-grained correspondence metadata. Second, we propose a self-supervised method for learning dense correspondences between sketch-photo pairs, building upon recent advances in correspondence learning for pairs of photos. Our model uses a spatial transformer network to estimate the warp flow between latent representations of a sketch and photo extracted by a contrastive learning-based ConvNet backbone. We found that this approach outperformed several strong baselines and produced predictions that were quantitatively consistent with other warp-based methods. However, our benchmark also revealed systematic differences between predictions of the suite of models we tested and those of humans. Taken together, our work suggests a promising path towards developing artificial systems that achieve more human-like understanding of visual images at different levels of abstraction. Project page: https://photo-sketch-correspondence.github.io
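The abstract's core mechanism is an STN-style head that predicts a dense warp between photo and sketch feature maps produced by a shared, contrastively pretrained ConvNet. Below is a minimal PyTorch sketch of that idea under stated assumptions: the module names, feature sizes, the stand-in encoder, and the alignment loss are all illustrative choices of ours, not the authors' implementation.

```python
# Hypothetical sketch: a shared encoder yields photo/sketch feature maps;
# a small head predicts dense offsets to the identity grid, and the photo
# features are warped onto the sketch via STN-style bilinear sampling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WarpEstimator(nn.Module):
    """Predicts a dense flow field (offsets to the identity sampling grid)
    from the concatenated photo and sketch feature maps."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(2 * feat_dim, 128, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 2, 3, padding=1),  # 2 channels: (dx, dy) offsets
        )

    def forward(self, f_photo, f_sketch):
        offsets = self.head(torch.cat([f_photo, f_sketch], dim=1))
        return offsets.permute(0, 2, 3, 1)  # B x H x W x 2, grid_sample layout

def warp(features, offsets):
    """Bilinearly samples `features` at the identity grid plus predicted
    offsets -- the sampling step of a spatial transformer network."""
    B = features.shape[0]
    theta = torch.eye(2, 3, device=features.device).unsqueeze(0).expand(B, -1, -1)
    grid = F.affine_grid(theta, features.shape, align_corners=False)
    return F.grid_sample(features, grid + offsets, align_corners=False)

# Toy usage; a single conv stands in for the contrastive ConvNet backbone.
encoder = nn.Conv2d(3, 256, 7, stride=8, padding=3)
estimator = WarpEstimator()
photo, sketch = torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256)
f_p, f_s = encoder(photo), encoder(sketch)
flow = estimator(f_p, f_s)
f_p_warped = warp(f_p, flow)        # photo features warped toward the sketch
loss = F.mse_loss(f_p_warped, f_s)  # illustrative alignment objective only
```

Predicting offsets relative to the identity grid (rather than absolute coordinates) is a common choice in flow estimation, since an untrained head then starts near the identity warp; whether the paper does the same is not specified in this abstract.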

Cite

Text

Lu et al. "Learning Dense Correspondences Between Photos and Sketches." International Conference on Machine Learning, 2023.

Markdown

[Lu et al. "Learning Dense Correspondences Between Photos and Sketches." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/lu2023icml-learning/)

BibTeX

@inproceedings{lu2023icml-learning,
  title     = {{Learning Dense Correspondences Between Photos and Sketches}},
  author    = {Lu, Xuanchen and Wang, Xiaolong and Fan, Judith E.},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {22899--22916},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/lu2023icml-learning/}
}