Discrete Markov Probabilistic Models: An Improved Discrete Score-Based Framework with Sharp Convergence Bounds Under Minimal Assumptions

Abstract

This paper introduces the Discrete Markov Probabilistic Model (DMPM), a novel algorithm for discrete data generation. The algorithm operates in discrete space, where the noising process is a continuous-time Markov chain that can be sampled exactly via a Poissonian clock that flips labels uniformly at random. The time-reversal process, like the forward noise process, is a jump process, with its intensity governed by a discrete analogue of the classical score function. Crucially, this intensity is proven to be the conditional expectation of a function of the forward process, strengthening its theoretical alignment with score-based generative models while ensuring robustness and efficiency. We further establish convergence bounds for the algorithm under minimal assumptions and demonstrate its effectiveness through experiments on low-dimensional Bernoulli-distributed datasets and high-dimensional binary MNIST data. The results highlight its strong performance in generating discrete structures. This work bridges theoretical foundations and practical applications, advancing the development of effective and theoretically grounded discrete generative modeling.
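
The forward noising step described in the abstract can be illustrated with a short sketch. It is not the authors' implementation. It assumes a binary state space {0,1}^d, a single Poisson clock whose total jump rate `rate` is a hypothetical parameter, and a jump rule that flips one coordinate chosen uniformly at random. The function name `forward_noise` is likewise illustrative. Because the number of jumps up to time `t` is Poisson distributed, the state at time `t` can be drawn exactly, without time discretization.

```python
import numpy as np


def forward_noise(x0, t, rate=1.0, rng=None):
    """Draw X_t given X_0 = x0 for a bit-flip jump process on {0,1}^d.

    Assumed dynamics (an illustration, not taken from the paper's code):
    a Poisson clock with intensity `rate` rings, and at each ring one
    coordinate, chosen uniformly at random, is flipped.
    """
    rng = np.random.default_rng() if rng is None else rng
    x0 = np.asarray(x0, dtype=np.int64)
    d = x0.shape[0]

    # Number of clock rings in [0, t] is Poisson(rate * t).
    n_jumps = rng.poisson(rate * t)

    # Each ring selects one coordinate uniformly at random.
    coords = rng.integers(0, d, size=n_jumps)

    # A coordinate ends up flipped iff it was selected an odd number of times.
    flips = np.bincount(coords, minlength=d) % 2
    return x0 ^ flips


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    print(forward_noise(x0, t=0.5, rng=rng))
```

Under these assumptions, each coordinate's flip count is an independent Poisson variable with mean `rate * t / d` (Poisson thinning), so for large `t` every bit approaches a fair coin flip, which is the kind of uniform reference distribution a reverse jump process would start from.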

Cite

Text

Pham et al. "Discrete Markov Probabilistic Models: An Improved Discrete Score-Based Framework with Sharp Convergence Bounds Under Minimal Assumptions." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Pham et al. "Discrete Markov Probabilistic Models: An Improved Discrete Score-Based Framework with Sharp Convergence Bounds Under Minimal Assumptions." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/pham2025icml-discrete/)

BibTeX

@inproceedings{pham2025icml-discrete,
  title     = {{Discrete Markov Probabilistic Models: An Improved Discrete Score-Based Framework with Sharp Convergence Bounds Under Minimal Assumptions}},
  author    = {Pham, Le-Tuyet-Nhi and Shariatian, Dario and Ocello, Antonio and Conforti, Giovanni and Oliviero Durmus, Alain},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {49195--49258},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/pham2025icml-discrete/}
}