The Diffusion Duality
Abstract
In the context of language modeling, Uniform State discrete Diffusion Models (USDMs) hold the promise of faster generation due to their ability to self-correct. However, they are typically outperformed by Masked Diffusion Models (MDMs). In this work, we tighten the likelihood gap between USDMs and MDMs by exploiting a fundamental insight: Uniform state diffusion processes naturally emerge from an underlying Gaussian diffusion. Our method, DUO, transfers powerful techniques from Gaussian diffusion to improve both training and sampling. First, we introduce a curriculum learning strategy guided by the Gaussian process, doubling training speed by reducing variance. Models trained with curriculum learning surpass autoregressive models in zero-shot perplexity on 3 of 7 benchmarks. Second, we present Discrete Consistency Distillation, which adapts consistency distillation from the continuous to the discrete setting. This method accelerates sampling by two orders of magnitude, while preserving both quality and diversity. The code and the trained models are available at the project page: https://s-sahoo.com/duo
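To make the stated duality concrete, the following minimal NumPy sketch (not taken from the authors' codebase) illustrates how a uniform-state discrete marginal can emerge from Gaussian diffusion: take a one-hot token, add Gaussian noise, and apply argmax. By symmetry, all incorrect tokens become equally likely, so the result matches the uniform-state form q(y_t | x) = a_t * onehot(x) + (1 - a_t)/K. The vocabulary size `K`, the signal level `alpha_t`, and the recovered parameter `a_t` are illustrative assumptions; DUO's actual diffusion-parameter transformation, curriculum, and training pipeline are more involved.

```python
import numpy as np

# Sketch only: argmax of a Gaussian-diffused one-hot vector gives a
# uniform-state discrete diffusion marginal (hypothetical toy settings).
rng = np.random.default_rng(0)

K = 8             # toy vocabulary size (assumption)
alpha_t = 0.6     # Gaussian signal level at time t (assumed schedule value)
sigma_t = np.sqrt(1.0 - alpha_t**2)
x = np.zeros(K)
x[3] = 1.0        # one-hot token

# Gaussian forward process: z_t = alpha_t * x + sigma_t * eps
n_samples = 200_000
z_t = alpha_t * x + sigma_t * rng.standard_normal((n_samples, K))

# Discrete latent obtained by taking the argmax of the Gaussian latent.
y_t = z_t.argmax(axis=1)
counts = np.bincount(y_t, minlength=K) / n_samples

# Every wrong token has equal probability, so the empirical marginal fits
# q(y_t | x) = a_t * onehot(x) + (1 - a_t) / K for some implied a_t.
p = counts[3]                       # probability the signal token survives
a_t = (K * p - 1.0) / (K - 1.0)     # implied uniform-state diffusion parameter
print("empirical marginal:        ", np.round(counts, 4))
print("implied parameter a_t:     ", round(a_t, 4))
print("uniform-state prediction:  ", np.round(a_t * x + (1.0 - a_t) / K, 4))
```

Running the sketch, the empirical marginal and the uniform-state prediction agree up to Monte Carlo error, which is the sense in which the discrete process is "hidden inside" the Gaussian one.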
Cite
Text
Sahoo et al. "The Diffusion Duality." ICLR 2025 Workshops: DeLTa, 2025.
Markdown
[Sahoo et al. "The Diffusion Duality." ICLR 2025 Workshops: DeLTa, 2025.](https://mlanthology.org/iclrw/2025/sahoo2025iclrw-diffusion/)
BibTeX
@inproceedings{sahoo2025iclrw-diffusion,
  title     = {{The Diffusion Duality}},
  author    = {Sahoo, Subham Sekhar and Deschenaux, Justin and Gokaslan, Aaron and Wang, Guanghan and Chiu, Justin T and Kuleshov, Volodymyr},
  booktitle = {ICLR 2025 Workshops: DeLTa},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/sahoo2025iclrw-diffusion/}
}