UVCGAN: UNet Vision Transformer Cycle-Consistent GAN for Unpaired Image-to-Image Translation
Abstract
Unpaired image-to-image translation has broad applications in art, design, and scientific simulations. One early breakthrough was CycleGAN, which emphasizes one-to-one mappings between two unpaired image domains via generative-adversarial networks (GANs) coupled with the cycle-consistency constraint, while more recent works promote one-to-many mappings to boost the diversity of the translated images. Motivated by scientific simulation and the need for one-to-one translation, this work revisits the classic CycleGAN framework and boosts its performance to outperform more contemporary models without relaxing the cycle-consistency constraint. To achieve this, we equip the generator with a Vision Transformer (ViT) and employ necessary training and regularization techniques. Compared to previous best-performing models, our model performs better and retains a strong correlation between the original and translated images. An accompanying ablation study shows that both the gradient penalty and self-supervised pre-training are crucial to the improvement. To promote reproducibility and open science, the source code, hyperparameter configurations, and pre-trained models are available at https://github.com/LS4GAN/uvcgan.
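For readers unfamiliar with the cycle-consistency constraint mentioned in the abstract, the sketch below illustrates the idea in PyTorch: a translated image should map back to the original when passed through the reverse generator. This is a minimal illustration only; the function names (G_ab, G_ba), the L1 reconstruction penalty, and the weight lambda_cyc are common CycleGAN conventions assumed here, not code taken from the UVCGAN repository.

```python
import torch.nn as nn

# Illustrative sketch of a cycle-consistency loss, assuming two generators:
#   G_ab : translates images from domain A to domain B
#   G_ba : translates images from domain B to domain A
# The weight lambda_cyc is a typical CycleGAN-style hyperparameter choice.

l1 = nn.L1Loss()

def cycle_consistency_loss(real_a, real_b, G_ab, G_ba, lambda_cyc=10.0):
    """Penalize images that fail to map back to themselves after a round trip."""
    fake_b  = G_ab(real_a)   # translate A -> B
    fake_a  = G_ba(real_b)   # translate B -> A
    recon_a = G_ba(fake_b)   # reconstruct A from the translated image
    recon_b = G_ab(fake_a)   # reconstruct B from the translated image
    return lambda_cyc * (l1(recon_a, real_a) + l1(recon_b, real_b))
```

In training, this term is added to the usual adversarial losses of both generators, which is what enforces the one-to-one mapping the paper builds on.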
Cite
Text

Torbunov et al. "UVCGAN: UNet Vision Transformer Cycle-Consistent GAN for Unpaired Image-to-Image Translation." Winter Conference on Applications of Computer Vision, 2023.

Markdown

[Torbunov et al. "UVCGAN: UNet Vision Transformer Cycle-Consistent GAN for Unpaired Image-to-Image Translation." Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/torbunov2023wacv-uvcgan/)

BibTeX
@inproceedings{torbunov2023wacv-uvcgan,
title = {{UVCGAN: UNet Vision Transformer Cycle-Consistent GAN for Unpaired Image-to-Image Translation}},
author = {Torbunov, Dmitrii and Huang, Yi and Yu, Haiwang and Huang, Jin and Yoo, Shinjae and Lin, Meifeng and Viren, Brett and Ren, Yihui},
booktitle = {Winter Conference on Applications of Computer Vision},
year = {2023},
pages = {702--712},
url = {https://mlanthology.org/wacv/2023/torbunov2023wacv-uvcgan/}
}