ElasticDiffusion: Training-Free Arbitrary Size Image Generation Through Global-Local Content Separation

Moayed Haji-Ali, Guha Balakrishnan, Vicente Ordonez

CVPR 2024 pp. 6603-6612

doi:10.1109/CVPR52733.2024.00631

Abstract

Diffusion models have revolutionized image generation in recent years, yet they are still limited to a few sizes and aspect ratios. We propose ElasticDiffusion, a novel training-free decoding method that enables pretrained text-to-image diffusion models to generate images with various sizes. ElasticDiffusion attempts to decouple the generation trajectory of a pretrained model into local and global signals. The local signal controls low-level pixel information and can be estimated on local patches, while the global signal is used to maintain overall structural consistency and is estimated with a reference image. We test our method on CelebA-HQ (faces) and LAION-COCO (objects/indoor/outdoor scenes). Our experiments and qualitative results show superior image coherence quality across aspect ratios compared to MultiDiffusion and the standard decoding strategy of Stable Diffusion. Project Webpage: https://elasticdiffusion.github.io
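
The global-local split described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: `local_signal` and `global_signal` are hypothetical stand-ins for calls to a pretrained model, the 2D array stands in for a latent, and the patch size, reference size, nearest-neighbor upsampling, and additive combination are all assumptions made for illustration. It shows only the shape of the idea: one estimate computed patch by patch, another computed at a fixed reference size and resized to the target.

```python
import numpy as np

def local_signal(patch):
    # Hypothetical stand-in for a model call on one patch:
    # returns the patch with its mean removed (fine detail only).
    return patch - patch.mean()

def global_signal(reference):
    # Hypothetical stand-in for a model call on the reference-size input.
    return reference * 0.5

def split_estimate(x, patch=4, ref=4):
    """Toy global-local estimate for a 2D array of arbitrary size."""
    h, w = x.shape
    assert h % patch == 0 and w % patch == 0, "size must be a multiple of patch"
    assert h % ref == 0 and w % ref == 0, "size must be a multiple of ref"

    # Local part: evaluate on non-overlapping patches and stitch them back.
    local = np.zeros_like(x)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            local[i:i + patch, j:j + patch] = local_signal(x[i:i + patch, j:j + patch])

    # Global part: average-pool to a ref x ref reference, evaluate there,
    # then upsample back to the target size by nearest-neighbor repetition.
    fh, fw = h // ref, w // ref
    reference = x.reshape(ref, fh, ref, fw).mean(axis=(1, 3))
    glob = np.kron(global_signal(reference), np.ones((fh, fw)))

    # Assumed combination rule for this sketch: simple addition.
    return local, glob, local + glob

x = np.arange(128, dtype=float).reshape(8, 16)  # non-square "latent"
local, glob, combined = split_estimate(x)
```

The stand-in functions are only ever called on inputs of a fixed size (a `patch` by `patch` tile or a `ref` by `ref` reference), even though `x` itself has a different size and aspect ratio.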

Cite

Text

Haji-Ali et al. "ElasticDiffusion: Training-Free Arbitrary Size Image Generation Through Global-Local Content Separation." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.00631

Markdown

[Haji-Ali et al. "ElasticDiffusion: Training-Free Arbitrary Size Image Generation Through Global-Local Content Separation." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/hajiali2024cvpr-elasticdiffusion/) doi:10.1109/CVPR52733.2024.00631

BibTeX

@inproceedings{hajiali2024cvpr-elasticdiffusion,
  title     = {{ElasticDiffusion: Training-Free Arbitrary Size Image Generation Through Global-Local Content Separation}},
  author    = {Haji-Ali, Moayed and Balakrishnan, Guha and Ordonez, Vicente},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {6603--6612},
  doi       = {10.1109/CVPR52733.2024.00631},
  url       = {https://mlanthology.org/cvpr/2024/hajiali2024cvpr-elasticdiffusion/}
}