Adversarial Diffusion Distillation

Abstract

We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that efficiently samples large-scale foundational image diffusion models in just 1–4 steps while maintaining high image quality. We use score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal in combination with an adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps. Our analyses show that our model clearly outperforms existing few-step methods (GANs, Latent Consistency Models) in a single step and reaches the performance of state-of-the-art diffusion models (SDXL) in only four steps. ADD is the first method to unlock single-step, real-time image synthesis with foundation models.
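To make the training objective described above concrete, below is a minimal PyTorch-style sketch of one ADD student (generator) update: the student denoises a noised real image in a single step, a hinge adversarial loss is computed against a discriminator, and a distillation loss matches the frozen teacher's prediction on a re-noised version of the student output. All names here (student, teacher, discriminator, alphas, sigmas) are hypothetical placeholders, not the authors' code; the per-timestep distillation weighting c(t) and the discriminator's own update are omitted for brevity. The weight lam = 2.5 follows the value reported in the paper.

import torch
import torch.nn.functional as F

def add_generator_loss(student, teacher, discriminator,
                       x0, s, t, alphas, sigmas, lam=2.5):
    """Combined adversarial + score-distillation loss for the ADD student.

    x0:             batch of real images, shape (B, C, H, W)
    s, t:           student / teacher diffusion timesteps (LongTensor, shape (B,))
    alphas, sigmas: per-timestep noise-schedule coefficients (1-D tensors)
    lam:            distillation weight (the paper uses lambda = 2.5)
    """
    a_s = alphas[s].view(-1, 1, 1, 1)
    sig_s = sigmas[s].view(-1, 1, 1, 1)
    a_t = alphas[t].view(-1, 1, 1, 1)
    sig_t = sigmas[t].view(-1, 1, 1, 1)

    # 1) Diffuse real data to the student timestep and denoise in one step.
    noise = torch.randn_like(x0)
    x_s = a_s * x0 + sig_s * noise
    x_student = student(x_s, s)                 # student prediction x-hat_theta

    # 2) Adversarial term: hinge-style generator loss against the discriminator,
    #    pushing single-step samples toward the real-image manifold.
    adv_loss = -discriminator(x_student).mean()

    # 3) Distillation term: re-noise the student sample, let the frozen teacher
    #    denoise it, and match the teacher's (stop-gradient) prediction.
    noise_t = torch.randn_like(x_student)
    x_t = a_t * x_student + sig_t * noise_t
    with torch.no_grad():
        x_teacher = teacher(x_t, t)             # teacher prediction x-hat_psi
    distill_loss = F.mse_loss(x_student, x_teacher)

    return adv_loss + lam * distill_loss

In this reading, the adversarial term supplies sharpness and realism in the one- and two-step regime where pure distillation tends to blur, while the distillation term keeps the student's samples anchored to the large pretrained teacher's output distribution.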

Cite

Text

Sauer et al. "Adversarial Diffusion Distillation." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73016-0_6

Markdown

[Sauer et al. "Adversarial Diffusion Distillation." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/sauer2024eccv-adversarial/) doi:10.1007/978-3-031-73016-0_6

BibTeX

@inproceedings{sauer2024eccv-adversarial,
  title     = {{Adversarial Diffusion Distillation}},
  author    = {Sauer, Axel and Lorenz, Dominik and Blattmann, Andreas and Rombach, Robin},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73016-0_6},
  url       = {https://mlanthology.org/eccv/2024/sauer2024eccv-adversarial/}
}