Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation

Abstract

Recently, text-to-image diffusion models have emerged as a powerful tool for image-to-image translation (I2I), allowing flexible image translation via user-provided text prompts. This paper proposes the frequency-controlled diffusion model (FCDiffusion), an end-to-end diffusion-based framework that offers a novel solution to text-guided I2I from a frequency-domain perspective. At the heart of our framework is a feature-space frequency-domain filtering module based on the Discrete Cosine Transform (DCT), which extracts image features carrying different DCT spectral bands to control the text-to-image generation process of the Latent Diffusion Model, realizing versatile I2I applications including style-guided content creation, image semantic manipulation, image scene translation, and image style translation. Unlike related methods, FCDiffusion establishes a unified text-driven I2I framework that suits diverse I2I application scenarios simply by switching among different frequency control branches. Extensive qualitative and quantitative experiments demonstrate the effectiveness and superiority of our method for text-guided I2I. Our project is publicly available at: https://xianggao1102.github.io/FCDiffusion/.
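The feature-space DCT filtering idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `dct_matrix` and `dct_band_filter`, the band definition (keeping coefficients whose index sum `u + v` falls in `[low, high)`), and the feature-map shape are all assumptions made purely for illustration.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # spatial index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def dct_band_filter(feat: np.ndarray, low: int, high: int) -> np.ndarray:
    """Keep only the DCT coefficients with low <= u + v < high.

    `feat` has shape (..., H, W); the 2D DCT is applied per channel.
    The band rule on u + v is an illustrative assumption, not
    necessarily the rule used in the paper.
    """
    h, w = feat.shape[-2:]
    ch, cw = dct_matrix(h), dct_matrix(w)
    spec = ch @ feat @ cw.T                 # forward 2D DCT
    u = np.arange(h)[:, None]
    v = np.arange(w)[None, :]
    mask = ((u + v) >= low) & ((u + v) < high)
    return ch.T @ (spec * mask) @ cw        # inverse 2D DCT of the kept band


# Hypothetical 4-channel, 16x16 feature map standing in for a latent feature.
rng = np.random.default_rng(0)
feature = rng.standard_normal((4, 16, 16))
low_band = dct_band_filter(feature, 0, 8)    # coarse structure
high_band = dct_band_filter(feature, 8, 31)  # fine detail
print(low_band.shape, high_band.shape)
```

Because the two masks in the usage lines are complementary, the two filtered maps sum back to the original feature map. In a frequency-controlled setup, each such band would feed a separate control branch; how the bands are chosen and injected into the diffusion model is described in the paper itself, not in this sketch.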

Cite

Text

Gao et al. "Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I3.27951

Markdown

[Gao et al. "Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/gao2024aaai-frequency/) doi:10.1609/AAAI.V38I3.27951

BibTeX

@inproceedings{gao2024aaai-frequency,
  title     = {{Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation}},
  author    = {Gao, Xiang and Xu, Zhengbo and Zhao, Junhan and Liu, Jiaying},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {1824--1832},
  doi       = {10.1609/AAAI.V38I3.27951},
  url       = {https://mlanthology.org/aaai/2024/gao2024aaai-frequency/}
}