The Spatially-Correlative Loss for Various Image Translation Tasks
Abstract
We propose a novel spatially-correlative loss that is simple, efficient, and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired image-to-image (I2I) translation. Previous methods attempt this by using pixel-level cycle-consistency or feature-level matching losses, but the domain-specific nature of these losses hinders translation across large domain gaps. To address this, we exploit the spatial patterns of self-similarity as a means of defining scene structure. Our spatially-correlative loss is geared towards capturing only spatial relationships within an image, rather than domain appearance. We also introduce a new self-supervised learning method to explicitly learn spatially-correlative maps for each specific translation task. We show distinct improvement over baseline models in all three modes of unpaired I2I translation: single-modal, multi-modal, and even single-image translation. This new loss can easily be integrated into existing network architectures and thus allows wide applicability.
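To make the core idea concrete, the sketch below is a minimal PyTorch illustration (not the authors' released implementation) of a self-similarity-based structure loss: each feature vector is compared against its local neighbourhood to form a spatially-correlative map, and the maps of the source image and its translation are then penalised for differing. The choice of encoder, the `patch_size` of the neighbourhood, and the L1 comparison are assumptions made here for illustration only.

```python
import torch
import torch.nn.functional as F

def self_similarity_map(feat, patch_size=7):
    """Compute a spatially-correlative (self-similarity) map.

    feat: (B, C, H, W) feature map from some encoder (choice of encoder is an
    assumption here; it is not part of this sketch).
    Returns: (B, H*W, patch_size*patch_size) cosine similarities between each
    spatial location and the locations in its local neighbourhood.
    """
    B, C, H, W = feat.shape
    feat = F.normalize(feat, dim=1)  # unit-length features -> dot product = cosine
    pad = patch_size // 2
    # Gather the local neighbourhood of every spatial location.
    neighbours = F.unfold(feat, kernel_size=patch_size, padding=pad)   # (B, C*k*k, H*W)
    neighbours = neighbours.view(B, C, patch_size * patch_size, H * W)
    query = feat.view(B, C, 1, H * W)
    # Cosine similarity between the centre feature and each of its neighbours.
    sim = (query * neighbours).sum(dim=1)                              # (B, k*k, H*W)
    return sim.permute(0, 2, 1)                                        # (B, H*W, k*k)

def spatially_correlative_loss(feat_src, feat_tgt, patch_size=7):
    """Penalise differences between the self-similarity patterns of the source
    image and its translation: appearance may change freely, structure may not."""
    sim_src = self_similarity_map(feat_src, patch_size)
    sim_tgt = self_similarity_map(feat_tgt, patch_size)
    return F.l1_loss(sim_src, sim_tgt)
```

Because only the pattern of similarities within each image is compared, rather than the feature values themselves, the loss is largely indifferent to domain appearance; the self-supervised variant described in the abstract would additionally learn the feature encoder for each specific translation task.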
Cite
Text
Zheng et al. "The Spatially-Correlative Loss for Various Image Translation Tasks." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.01614Markdown
[Zheng et al. "The Spatially-Correlative Loss for Various Image Translation Tasks." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/zheng2021cvpr-spatiallycorrelative/) doi:10.1109/CVPR46437.2021.01614BibTeX
@inproceedings{zheng2021cvpr-spatiallycorrelative,
title = {{The Spatially-Correlative Loss for Various Image Translation Tasks}},
author = {Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2021},
pages = {16407--16417},
doi = {10.1109/CVPR46437.2021.01614},
url = {https://mlanthology.org/cvpr/2021/zheng2021cvpr-spatiallycorrelative/}
}