Shift-Tolerant Perceptual Similarity Metric
Abstract
Existing perceptual similarity metrics assume that an image and its reference are well aligned. As a result, these metrics are often sensitive to small alignment errors that are imperceptible to the human eye. This paper studies the effect of small misalignment, specifically a small shift between the input and the reference image, on existing metrics, and accordingly develops a shift-tolerant similarity metric. This paper builds upon LPIPS, a widely used learned perceptual similarity metric, and explores architectural design considerations to make it robust against imperceptible misalignment. Specifically, we study a wide spectrum of neural network elements, such as anti-aliasing filtering, pooling, striding, padding, and skip connections, and discuss their roles in making a robust metric. Based on our studies, we develop a new deep neural network-based perceptual similarity metric. Our experiments show that our metric is tolerant to imperceptible shifts while remaining consistent with human similarity judgments. Code is available at https://tinyurl.com/5n85r28r.
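The abstract only names the network elements studied. As a rough illustration of how anti-aliased downsampling can enter an LPIPS-style metric, the sketch below uses a depthwise blur filter before stride-2 subsampling (a standard anti-aliasing trick) inside a toy feature extractor, and computes a feature-space distance between an image and a one-pixel-shifted copy. This is a minimal sketch under assumed design choices; the module names, filter sizes, and network layout here are illustrative and are not taken from the paper or its released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPool2d(nn.Module):
    """Anti-aliased downsampling: depthwise low-pass blur, then stride-2 subsampling."""
    def __init__(self, channels, stride=2):
        super().__init__()
        k = torch.tensor([1.0, 2.0, 1.0])
        k = torch.outer(k, k)                 # 3x3 binomial low-pass filter
        k = k / k.sum()
        self.register_buffer("kernel", k.expand(channels, 1, 3, 3).clone())
        self.stride = stride
        self.channels = channels

    def forward(self, x):
        x = F.pad(x, (1, 1, 1, 1), mode="reflect")
        return F.conv2d(x, self.kernel, stride=self.stride, groups=self.channels)

class TinyFeatureNet(nn.Module):
    """Toy two-stage extractor; blur-pool replaces plain strided pooling for downsampling."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.down1 = BlurPool2d(16)
        self.down2 = BlurPool2d(32)

    def forward(self, x):
        f1 = F.relu(self.conv1(x))
        f2 = F.relu(self.conv2(self.down1(f1)))
        return [f1, self.down2(f2)]

def lpips_style_distance(net, img_a, img_b):
    """Squared difference between unit-normalized feature maps, averaged and summed over stages."""
    d = 0.0
    for a, b in zip(net(img_a), net(img_b)):
        a = a / (a.norm(dim=1, keepdim=True) + 1e-8)   # normalize along the channel dimension
        b = b / (b.norm(dim=1, keepdim=True) + 1e-8)
        d = d + (a - b).pow(2).mean()
    return d

if __name__ == "__main__":
    net = TinyFeatureNet().eval()
    img = torch.rand(1, 3, 64, 64)
    shifted = torch.roll(img, shifts=(1, 1), dims=(2, 3))  # imperceptible 1-pixel shift
    with torch.no_grad():
        print(float(lpips_style_distance(net, img, shifted)))
```

With random (untrained) weights the printed distance is not meaningful as a perceptual score; the sketch only shows where anti-aliased downsampling sits relative to the feature comparison in an LPIPS-style pipeline.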
Cite
Text
Ghildyal and Liu. "Shift-Tolerant Perceptual Similarity Metric." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19797-0_6
Markdown
[Ghildyal and Liu. "Shift-Tolerant Perceptual Similarity Metric." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/ghildyal2022eccv-shifttolerant/) doi:10.1007/978-3-031-19797-0_6
BibTeX
@inproceedings{ghildyal2022eccv-shifttolerant,
title = {{Shift-Tolerant Perceptual Similarity Metric}},
author = {Ghildyal, Abhijay and Liu, Feng},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
doi = {10.1007/978-3-031-19797-0_6},
url = {https://mlanthology.org/eccv/2022/ghildyal2022eccv-shifttolerant/}
}