Unified Discrete Diffusion for Categorical Data
Abstract
Discrete diffusion models have attracted significant attention for their application to naturally discrete data, such as language and graphs. While discrete-time discrete diffusion has been established for some time, it was only recently that Campbell et al. (2022) introduced the first framework for continuous-time discrete diffusion. However, their training and backward sampling processes differ significantly from those of the discrete-time version, requiring nontrivial approximations for tractability. In this paper, we first introduce a series of generalizations and simplifications of the evidence lower bound (ELBO) that facilitate more accurate and easier optimization of both discrete- and continuous-time discrete diffusion. We further establish a unification of discrete- and continuous-time discrete diffusion through a shared forward process and backward parameterization. Thanks to this unification, continuous-time diffusion can now utilize the exact and efficient backward process developed for the discrete-time case, avoiding the need for costly and inexact approximations. Similarly, discrete-time diffusion can now also employ the MCMC corrector, which was previously exclusive to the continuous-time case. Extensive experiments and ablations demonstrate significant improvements, and we open-source our code at: https://github.com/LingxiaoShawn/USD3.
Cite
Text
Zhao et al. "Unified Discrete Diffusion for Categorical Data." Journal of Machine Learning Research, 2025.
Markdown
[Zhao et al. "Unified Discrete Diffusion for Categorical Data." Journal of Machine Learning Research, 2025.](https://mlanthology.org/jmlr/2025/zhao2025jmlr-unified/)
BibTeX
@article{zhao2025jmlr-unified,
title = {{Unified Discrete Diffusion for Categorical Data}},
author = {Zhao, Lingxiao and Ding, Xueying and Yu, Lijun and Akoglu, Leman},
journal = {Journal of Machine Learning Research},
year = {2025},
pages = {1-49},
volume = {26},
url = {https://mlanthology.org/jmlr/2025/zhao2025jmlr-unified/}
}