Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation
Abstract
Recently, text-to-image diffusion models have emerged as a powerful tool for image-to-image translation (I2I), allowing flexible image translation via user-provided text prompts. This paper proposes a frequency-controlled diffusion model (FCDiffusion), an end-to-end diffusion-based framework that contributes a novel solution to text-guided I2I from a frequency-domain perspective. At the heart of our framework is a feature-space frequency-domain filtering module based on the Discrete Cosine Transform (DCT), which extracts image features carrying different DCT spectral bands to control the text-to-image generation process of the Latent Diffusion Model, realizing versatile I2I applications including style-guided content creation, image semantic manipulation, image scene translation, and image style translation. Unlike related methods, FCDiffusion establishes a unified text-driven I2I framework that suits diverse I2I application scenarios simply by switching among different frequency control branches. Extensive qualitative and quantitative experiments demonstrate the effectiveness and superiority of our method for text-guided I2I. Our project is publicly available at: https://xianggao1102.github.io/FCDiffusion/.
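The frequency-domain filtering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-channel 2D feature map, an orthonormal DCT-II, and a hypothetical band criterion that keeps coefficients whose index sum u + v falls in a chosen range. The function names (`dct2`, `idct2`, `band_filter`) and the thresholds are illustrative assumptions, not taken from the paper.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(x):
    """2D DCT of a feature map: C_h @ x @ C_w^T."""
    ch, cw = dct_matrix(len(x)), dct_matrix(len(x[0]))
    return matmul(matmul(ch, x), transpose(cw))

def idct2(coeffs):
    """Inverse 2D DCT: C_h^T @ coeffs @ C_w (valid because the basis is orthonormal)."""
    ch, cw = dct_matrix(len(coeffs)), dct_matrix(len(coeffs[0]))
    return matmul(matmul(transpose(ch), coeffs), cw)

def band_filter(feature, low, high):
    """Keep DCT coefficients with low <= u + v < high, zero the rest, and invert.

    The u + v criterion and the thresholds are assumptions made for this sketch.
    """
    coeffs = dct2(feature)
    kept = [[c if low <= u + v < high else 0.0 for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]
    return idct2(kept)
```

Under these assumptions, calling `band_filter(feature, 0, 3)` would keep only the lowest-frequency coefficients, while a range such as `(3, 7)` on a 4x4 map would keep the remaining higher-frequency ones. Switching the range plays the role of switching between frequency control branches in this toy setting.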
Cite
Text
Gao et al. "Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I3.27951
Markdown
[Gao et al. "Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/gao2024aaai-frequency/) doi:10.1609/AAAI.V38I3.27951
BibTeX
@inproceedings{gao2024aaai-frequency,
title = {{Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation}},
author = {Gao, Xiang and Xu, Zhengbo and Zhao, Junhan and Liu, Jiaying},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
pages = {1824-1832},
doi = {10.1609/AAAI.V38I3.27951},
url = {https://mlanthology.org/aaai/2024/gao2024aaai-frequency/}
}