Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

Abstract

Diffusion models (DMs) have shown great potential for high-quality image synthesis. However, when it comes to producing images with complex scenes, properly describing both global image structures and object details remains a challenging task. In this paper, we present Frido, a Feature Pyramid Diffusion model that performs a multi-scale coarse-to-fine denoising process for image synthesis. Our model decomposes an input image into scale-dependent vector quantized features, followed by a coarse-to-fine gating for producing image output. During the above multi-scale representation learning stage, additional input conditions like text, scene graph, or image layout can be further exploited. Thus, Frido can also be applied to conditional or cross-modality image synthesis. We conduct extensive experiments over various unconditional and conditional image generation tasks, including text-to-image, layout-to-image, scene-graph-to-image, and label-to-image synthesis. More specifically, we achieve state-of-the-art FID scores on five benchmarks, namely layout-to-image on COCO and OpenImages, scene-graph-to-image on COCO and Visual Genome, and label-to-image on COCO.
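
As a rough illustration of what a coarse-to-fine denoising order could look like, the following toy sketch enumerates the steps for a small feature pyramid. It is not the authors' implementation: the function names, the level-indexing convention (higher index means coarser), and the fixed number of steps per level are all assumptions made for this example, and no actual denoising network is involved.

```python
from typing import Iterator, List, Tuple


def coarse_to_fine_schedule(
    num_levels: int, steps_per_level: int
) -> Iterator[Tuple[int, int]]:
    """Yield (level, timestep) pairs, coarsest level first.

    Assumed convention: level num_levels - 1 is the coarsest and
    level 0 is the finest. Within each level, timesteps count down
    from steps_per_level - 1 to 0.
    """
    for level in range(num_levels - 1, -1, -1):
        for t in range(steps_per_level - 1, -1, -1):
            yield level, t


def level_states(num_levels: int, active_level: int) -> List[str]:
    """Describe every level while `active_level` is being denoised."""
    states = []
    for level in range(num_levels):
        if level > active_level:
            states.append("denoised")   # coarser levels are already finished
        elif level == active_level:
            states.append("denoising")  # the level currently being refined
        else:
            states.append("noise")      # finer levels have not been touched yet
    return states


# Hypothetical two-level pyramid with three denoising steps per level.
schedule = list(coarse_to_fine_schedule(num_levels=2, steps_per_level=3))
print(schedule)
print(level_states(num_levels=3, active_level=1))
```

In this sketch the coarse level is fully denoised before any fine-level step begins, so global structure is fixed first and object details are filled in afterwards.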

Cite

Text

Fan et al. "Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I1.25133

Markdown

[Fan et al. "Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/fan2023aaai-frido/) doi:10.1609/AAAI.V37I1.25133

BibTeX

@inproceedings{fan2023aaai-frido,
  title     = {{Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis}},
  author    = {Fan, Wan-Cyuan and Chen, Yen-Chun and Chen, Dongdong and Cheng, Yu and Yuan, Lu and Wang, Yu-Chiang Frank},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2023},
  pages     = {579--587},
  doi       = {10.1609/AAAI.V37I1.25133},
  url       = {https://mlanthology.org/aaai/2023/fan2023aaai-frido/}
}