Semantic Pixel Distances for Image Editing
Abstract
Many image editing techniques make processing decisions based on measures of similarity between pairs of pixels. Traditionally, pixel similarity is measured using a simple L2 distance on RGB or luminance values. In this work, we explore a richer notion of similarity based on feature embeddings learned by convolutional neural networks. We propose to measure pixel similarity by combining distance in a semantically-meaningful feature embedding with traditional color difference. Using semantic features from the penultimate layer of an off-the-shelf semantic segmentation model, we evaluate our distance measure in two image editing applications. A user study shows that incorporating semantic distances into content-aware resizing via seam carving [2] produces improved results. Off-the-shelf semantic features are found to have mixed effectiveness in content-based range masking, suggesting that training better general-purpose pixel embeddings presents a promising future direction for creating semantically-meaningful feature spaces that can be used in a variety of applications.
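The combined similarity measure described above can be sketched as a weighted blend of a color-space distance and a feature-space distance. The linear combination and the `alpha` weight below are illustrative assumptions, not the exact formulation from the paper:

```python
import numpy as np

def combined_pixel_distance(rgb_a, rgb_b, feat_a, feat_b, alpha=0.5):
    """Blend traditional color difference with semantic feature distance.

    rgb_a, rgb_b : RGB values of the two pixels.
    feat_a, feat_b : per-pixel feature vectors, e.g. from the penultimate
        layer of a semantic segmentation network.
    alpha : blend weight (hypothetical; the paper's exact combination
        may differ). alpha=0 reduces to plain color distance.
    """
    color_d = np.linalg.norm(np.asarray(rgb_a, float) - np.asarray(rgb_b, float))
    feat_d = np.linalg.norm(np.asarray(feat_a, float) - np.asarray(feat_b, float))
    return (1.0 - alpha) * color_d + alpha * feat_d
```

A measure like this can drop into any algorithm that compares pixel pairs, e.g. replacing the gradient-based energy in seam carving with one that also penalizes cutting across semantic boundaries.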
Cite
Text
Myers-Dean and Wehrwein. "Semantic Pixel Distances for Image Editing." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. doi:10.1109/CVPRW50498.2020.00275
Markdown
[Myers-Dean and Wehrwein. "Semantic Pixel Distances for Image Editing." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.](https://mlanthology.org/cvprw/2020/myersdean2020cvprw-semantic/) doi:10.1109/CVPRW50498.2020.00275
BibTeX
@inproceedings{myersdean2020cvprw-semantic,
title = {{Semantic Pixel Distances for Image Editing}},
author = {Myers-Dean, Josh and Wehrwein, Scott},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2020},
pages = {2267-2274},
doi = {10.1109/CVPRW50498.2020.00275},
url = {https://mlanthology.org/cvprw/2020/myersdean2020cvprw-semantic/}
}