Shift-Tolerant Perceptual Similarity Metric

Abstract

Existing perceptual similarity metrics assume an image and its reference are well aligned. As a result, these metrics are often sensitive to small alignment errors that are imperceptible to the human eye. This paper studies the effect of small misalignment, specifically a small shift between the input and the reference image, on existing metrics, and accordingly develops a shift-tolerant similarity metric. This paper builds upon LPIPS, a widely used learned perceptual similarity metric, and explores architectural design considerations to make it robust against imperceptible misalignment. Specifically, we study a wide spectrum of neural network elements, such as anti-aliasing filtering, pooling, striding, padding, and skip connections, and discuss their roles in making a robust metric. Based on our studies, we develop a new deep neural network-based perceptual similarity metric. Our experiments show that our metric is tolerant to imperceptible shifts while being consistent with human similarity judgments. Code is available at https://tinyurl.com/5n85r28r.
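To give a concrete sense of one of the elements mentioned above, the sketch below illustrates anti-aliased downsampling (a low-pass "blur pool" applied before subsampling) in a VGG/LPIPS-style feature extractor. This is a minimal, hypothetical PyTorch illustration of the general idea, not the authors' implementation; the class name BlurPool2d, the 3x3 binomial kernel, and the reflect padding are assumptions for the example.

# Illustrative sketch (not the paper's code): replace a strided convolution with
# a stride-1 convolution followed by a fixed low-pass filter and subsampling,
# which reduces aliasing and thus sensitivity to small shifts.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPool2d(nn.Module):
    """Anti-aliased downsampling: depthwise low-pass filter, then stride-2 subsampling."""
    def __init__(self, channels: int):
        super().__init__()
        # 3x3 binomial (approximately Gaussian) kernel, applied per channel.
        k = torch.tensor([1., 2., 1.])
        kernel = torch.outer(k, k)
        kernel = kernel / kernel.sum()
        self.register_buffer("kernel", kernel.expand(channels, 1, 3, 3).clone())
        self.channels = channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Padding choice also affects shift robustness; reflect padding is one option.
        x = F.pad(x, (1, 1, 1, 1), mode="reflect")
        return F.conv2d(x, self.kernel, stride=2, groups=self.channels)

# Example downsampling block for a VGG-like feature extractor (hypothetical sizes).
block = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),  # stride 1 instead of stride 2
    nn.ReLU(inplace=True),
    BlurPool2d(128),                               # low-pass filter before subsampling
)

x = torch.randn(1, 64, 64, 64)
print(block(x).shape)  # torch.Size([1, 128, 32, 32])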

Cite

Text

Ghildyal and Liu. "Shift-Tolerant Perceptual Similarity Metric." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19797-0_6

Markdown

[Ghildyal and Liu. "Shift-Tolerant Perceptual Similarity Metric." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/ghildyal2022eccv-shifttolerant/) doi:10.1007/978-3-031-19797-0_6

BibTeX

@inproceedings{ghildyal2022eccv-shifttolerant,
  title     = {{Shift-Tolerant Perceptual Similarity Metric}},
  author    = {Ghildyal, Abhijay and Liu, Feng},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19797-0_6},
  url       = {https://mlanthology.org/eccv/2022/ghildyal2022eccv-shifttolerant/}
}