ReCON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories

Abstract

Text-to-image diffusion models excel at generating photo-realistic images but are hampered by slow processing times. Training-free retrieval-based acceleration methods, which leverage pre-generated “trajectories,” have been introduced to address this. Yet these methods often lack diversity and fidelity because they depend heavily on similarity to stored prompts. To address this, we present ReCON (Retrieving Concepts), a retrieval-based diffusion acceleration method that extracts visual “concepts” from prompts, forming a knowledge base that facilitates the creation of adaptable trajectories. Consequently, ReCON surpasses existing retrieval-based methods, producing high-fidelity images and reducing the required Neural Function Evaluations (NFEs) by up to 40%. Extensive testing on the MS-COCO, Pick-a-Pic, and DiffusionDB datasets confirms that ReCON consistently outperforms established methods across multiple metrics such as Pick Score, CLIP Score, and Aesthetics Score. A user study further indicates that 76% of images generated by ReCON are rated as the highest fidelity, outperforming two competing methods: a purely text-based retrieval and a noise similarity-based retrieval. Project URL: https://stevencylu.github.io/ReCon.

Cite

Text

Lu et al. "ReCON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73202-7_17

Markdown

[Lu et al. "ReCON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/lu2024eccv-recon/) doi:10.1007/978-3-031-73202-7_17

BibTeX

@inproceedings{lu2024eccv-recon,
  title     = {{ReCON: Training-Free Acceleration for Text-to-Image Synthesis with Retrieval of Concept Prompt Trajectories}},
  author    = {Lu, Chen-Yi and Agarwal, Shubham and Tanjim, Md Mehrab and Mahadik, Kanak and Rao, Anup and Mitra, Subrata and Saini, Shiv K and Bagchi, Saurabh and Chaterji, Somali},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73202-7_17},
  url       = {https://mlanthology.org/eccv/2024/lu2024eccv-recon/}
}