Unified Discrete Diffusion for Categorical Data

Abstract

Discrete diffusion models have attracted significant attention for their application to naturally discrete data, such as language and graphs. While discrete-time discrete diffusion has been established for some time, it was only recently that Campbell et al. (2022) introduced the first framework for continuous-time discrete diffusion. However, their training and backward sampling processes differ significantly from those of the discrete-time version and require nontrivial approximations for tractability. In this paper, we first introduce a series of generalizations and simplifications of the evidence lower bound (ELBO) that facilitate more accurate and easier optimization of both discrete- and continuous-time discrete diffusion. We further establish a unification of discrete- and continuous-time discrete diffusion through a shared forward process and backward parameterization. Thanks to this unification, continuous-time diffusion can now use the exact and efficient backward process developed for the discrete-time case, avoiding the need for costly and inexact approximations. Similarly, discrete-time diffusion can now also employ the MCMC corrector, which was previously exclusive to the continuous-time case. Extensive experiments and ablations demonstrate significant improvements, and we open-source our code at: https://github.com/LingxiaoShawn/USD3.
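
For readers new to the setting, the following is a minimal sketch of a forward (noising) process for categorical data. It is a generic uniform-noise construction and is not taken from the paper or its repository. The names `forward_marginal`, `sample_xt`, and `alpha_bar` are hypothetical. The sketch assumes that each element keeps its original category with probability `alpha_bar` and is otherwise resampled uniformly; under that assumption, `alpha_bar` could be indexed either by a discrete step or by a continuous time.

```python
import random


def forward_marginal(x0, alpha_bar, num_classes):
    """Return q(x_t | x_0) as a list of probabilities for one categorical variable.

    Assumption (illustrative only): uniform-noise forward process, where the
    original category is kept with probability alpha_bar and otherwise a
    category is drawn uniformly from all num_classes categories.
    """
    uniform_mass = (1.0 - alpha_bar) / num_classes
    probs = [uniform_mass] * num_classes
    probs[x0] += alpha_bar
    return probs


def sample_xt(x0_seq, alpha_bar, num_classes, rng):
    """Corrupt every position of a sequence independently."""
    noisy = []
    for x0 in x0_seq:
        probs = forward_marginal(x0, alpha_bar, num_classes)
        noisy.append(rng.choices(range(num_classes), weights=probs, k=1)[0])
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)
    x0 = [0, 3, 1, 2]
    # High alpha_bar: most positions keep their original category.
    print(sample_xt(x0, alpha_bar=0.9, num_classes=4, rng=rng))
    # alpha_bar = 0: every position is drawn from the uniform distribution.
    print(sample_xt(x0, alpha_bar=0.0, num_classes=4, rng=rng))
```

With `alpha_bar = 1.0` the sequence is returned unchanged, and as `alpha_bar` approaches 0 the samples approach uniform noise, which is the starting point a backward process would denoise from.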

Cite

Text

Zhao et al. "Unified Discrete Diffusion for Categorical Data." Journal of Machine Learning Research, 2025.

Markdown

[Zhao et al. "Unified Discrete Diffusion for Categorical Data." Journal of Machine Learning Research, 2025.](https://mlanthology.org/jmlr/2025/zhao2025jmlr-unified/)

BibTeX

@article{zhao2025jmlr-unified,
  title     = {{Unified Discrete Diffusion for Categorical Data}},
  author    = {Zhao, Lingxiao and Ding, Xueying and Yu, Lijun and Akoglu, Leman},
  journal   = {Journal of Machine Learning Research},
  year      = {2025},
  pages     = {1-49},
  volume    = {26},
  url       = {https://mlanthology.org/jmlr/2025/zhao2025jmlr-unified/}
}