Semantic Pixel Distances for Image Editing

Abstract

Many image editing techniques make processing decisions based on measures of similarity between pairs of pixels. Traditionally, pixel similarity is measured using a simple L2 distance on RGB or luminance values. In this work, we explore a richer notion of similarity based on feature embeddings learned by convolutional neural networks. We propose to measure pixel similarity by combining distance in a semantically meaningful feature embedding with traditional color difference. Using semantic features from the penultimate layer of an off-the-shelf semantic segmentation model, we evaluate our distance measure in two image editing applications. A user study shows that incorporating semantic distances into content-aware resizing via seam carving [2] produces improved results. Off-the-shelf semantic features are found to have mixed effectiveness in content-based range masking, suggesting that training better general-purpose pixel embeddings is a promising future direction for creating semantically meaningful feature spaces usable across a variety of applications.
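
The abstract names the ingredients of the proposed measure (a learned per-pixel embedding plus a traditional color difference) but not the exact combination rule. The sketch below illustrates one plausible form as a weighted sum of L2 distances; the weight alpha, the assumption that features are upsampled to image resolution, and the function name pixel_distance are illustrative choices, not details taken from the paper.

import numpy as np

def pixel_distance(feat, rgb, p, q, alpha=0.5):
    """Combined semantic + color distance between pixels p and q.

    A sketch only: the paper's exact combination rule is not given in
    the abstract, so a simple weighted sum is assumed here.

    feat  -- (H, W, D) per-pixel embedding, e.g. penultimate-layer
             activations of a segmentation network, upsampled to image size
    rgb   -- (H, W, 3) color image with values in [0, 1]
    p, q  -- (row, col) pixel coordinates
    alpha -- hypothetical weight trading off semantic vs. color terms
    """
    d_semantic = np.linalg.norm(feat[p] - feat[q])  # L2 in embedding space
    d_color = np.linalg.norm(rgb[p] - rgb[q])       # traditional L2 on RGB
    return alpha * d_semantic + (1.0 - alpha) * d_color

# Toy usage with random arrays standing in for a real image and its features.
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 4, 64)).astype(np.float32)
rgb = rng.random((4, 4, 3)).astype(np.float32)
print(pixel_distance(feat, rgb, (0, 0), (3, 3)))

In a seam-carving setting, a distance of this kind could replace the usual gradient-magnitude term when scoring neighboring pixels, consistent with (though not a verbatim description of) the content-aware resizing application the abstract evaluates.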

Cite

Text

Myers-Dean and Wehrwein. "Semantic Pixel Distances for Image Editing." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. doi:10.1109/CVPRW50498.2020.00275

Markdown

[Myers-Dean and Wehrwein. "Semantic Pixel Distances for Image Editing." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.](https://mlanthology.org/cvprw/2020/myersdean2020cvprw-semantic/) doi:10.1109/CVPRW50498.2020.00275

BibTeX

@inproceedings{myersdean2020cvprw-semantic,
  title     = {{Semantic Pixel Distances for Image Editing}},
  author    = {Myers-Dean, Josh and Wehrwein, Scott},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2020},
  pages     = {2267--2274},
  doi       = {10.1109/CVPRW50498.2020.00275},
  url       = {https://mlanthology.org/cvprw/2020/myersdean2020cvprw-semantic/}
}