E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

Abstract

One highly promising direction for enabling flexible real-time on-device image editing is data distillation: using large-scale text-to-image diffusion models to generate paired datasets for training generative adversarial networks (GANs). This approach notably alleviates the need for high-end commercial GPUs when performing image editing with diffusion models. However, unlike text-to-image diffusion models, each distilled GAN is specialized for a specific image editing task, necessitating costly training efforts to obtain models for various concepts. In this work, we introduce and address a novel research direction: can the process of distilling GANs from diffusion models be made significantly more efficient? To achieve this goal, we propose a series of innovative techniques. First, we construct a base GAN model with generalized features that can be adapted to different concepts through fine-tuning, eliminating the need for training from scratch. Second, we identify crucial layers within the base GAN model and employ Low-Rank Adaptation (LoRA) with a simple yet effective rank search process, rather than fine-tuning the entire base model. Third, we investigate the minimal amount of data necessary for fine-tuning, further reducing the overall training time. Extensive experiments show that we can efficiently empower GANs to perform real-time high-quality image editing on mobile devices with remarkably reduced training and storage costs for each concept.
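To make the second technique concrete, the following is a minimal pure-Python sketch of the general LoRA idea: a frozen weight matrix `w` is adapted by a trainable low-rank product `b @ a`, so only `rank * (d_out + d_in)` parameters need to be trained and stored per concept. It is not the authors' implementation. The function names, the `scale` factor, and in particular `search_rank` (which picks the smallest candidate rank whose score reaches a threshold) are assumptions made for illustration; the abstract does not specify how the rank search works.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_weight(w, b, a, scale=1.0):
    """Return the adapted weight w + scale * (b @ a).

    w: frozen base weight, shape (d_out, d_in)
    b: trainable factor, shape (d_out, rank)
    a: trainable factor, shape (rank, d_in)
    """
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_param_count(d_out, d_in, rank):
    """Number of trainable parameters in the two low-rank factors."""
    return rank * (d_out + d_in)


def search_rank(evaluate, candidate_ranks, threshold):
    """Hypothetical rank search (an assumption, not the paper's procedure):
    return the smallest rank whose evaluation score reaches the threshold,
    falling back to the largest candidate if none does."""
    for rank in sorted(candidate_ranks):
        if evaluate(rank) >= threshold:
            return rank
    return max(candidate_ranks)


# Toy example: a 2x2 base weight adapted with a rank-1 update.
w = [[1.0, 2.0], [3.0, 4.0]]
b = [[1.0], [2.0]]
a = [[0.5, -1.0]]
adapted = lora_weight(w, b, a)

# Storage comparison for a hypothetical 256x256 layer.
full_params = 256 * 256
lora_params = lora_param_count(256, 256, rank=4)
```

For the hypothetical 256x256 layer, the rank-4 factors hold 2,048 values against 65,536 for the full matrix, which illustrates why per-concept storage shrinks when only low-rank factors are saved.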

Cite

Text

Gong et al. "E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation." International Conference on Machine Learning, 2024.

Markdown

[Gong et al. "E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/gong2024icml-2gan/)

BibTeX

@inproceedings{gong2024icml-2gan,
  title     = {{E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation}},
  author    = {Gong, Yifan and Zhan, Zheng and Jin, Qing and Li, Yanyu and Idelbayev, Yerlan and Liu, Xian and Zharkov, Andrey and Aberman, Kfir and Tulyakov, Sergey and Wang, Yanzhi and Ren, Jian},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {15929-15950},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/gong2024icml-2gan/}
}