Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing

Abstract

Masked diffusion models (MDMs) offer a compelling alternative to autoregressive models (ARMs) for discrete text generation because they enable parallel token sampling, rather than sequential, left-to-right generation. This means potentially much faster inference. However, effective parallel sampling faces two competing requirements: (i) simultaneously updated tokens must be conditionally independent, and (ii) updates should prioritise high-confidence predictions. These goals conflict because high-confidence predictions often cluster and depend on each other, which reduces opportunities for parallel updates. We present PUNT, a model-agnostic sampler that reconciles this trade-off. Our method identifies token dependencies and removes lower-confidence tokens from conflicting groups. This produces sets of indices for unmasking that satisfy both independence and confidence criteria. Our approach ensures improved parallel unmasking through approximate conditional independence testing. Our experiments show that PUNT delivers a superior trade-off between accuracy and compute when compared to other strong training-free baselines, especially for generation of longer sequences. On the IFEval benchmark, it achieves up to 16% higher accuracy than baseline methods, including sequential (one-by-one) generation. These gains hold across different values of hyperparameters, mitigating the need for brittle hyperparameter tuning. Moreover, we observe that PUNT induces an emergent hierarchical generation strategy, where the model first establishes high-level paragraph structure before local refinement, suggesting a planning-like generation process that contributes to strong alignment performance.
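
The idea of removing lower-confidence tokens from conflicting groups can be illustrated with a minimal sketch. This is not the PUNT algorithm itself, which the abstract does not specify in detail: the function name `select_parallel_unmask`, the per-position `confidence` dictionary, and the pairwise `depends` predicate (standing in for an approximate conditional independence test) are all assumptions made for illustration.

```python
def select_parallel_unmask(confidence, depends):
    """Greedy illustration (hypothetical, not the paper's procedure).

    confidence: dict mapping masked position -> model confidence (assumed given).
    depends:    callable (i, j) -> bool, True if positions i and j are judged
                dependent (assumed stand-in for an independence test).
    Returns a sorted list of positions that can be unmasked together.
    """
    # Visit positions from most to least confident.
    order = sorted(confidence, key=confidence.get, reverse=True)
    chosen = []
    for i in order:
        # Keep i only if it conflicts with no higher-confidence position already kept.
        if all(not depends(i, j) for j in chosen):
            chosen.append(i)
    return sorted(chosen)


# Toy example: position 1 conflicts with both 0 and 2 and has lower confidence.
confidence = {0: 0.90, 1: 0.80, 2: 0.95, 5: 0.60}
conflicts = {frozenset({0, 1}), frozenset({1, 2})}
depends = lambda i, j: frozenset({i, j}) in conflicts

selected = select_parallel_unmask(confidence, depends)
print(selected)  # [0, 2, 5]
```

In this toy case position 1 is dropped because it depends on higher-confidence neighbours, while positions 0, 2, and 5 are mutually independent and can be unmasked in the same step.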

Cite

Text

Azangulov et al. "Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing." International Conference on Learning Representations, 2026.

Markdown

[Azangulov et al. "Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/azangulov2026iclr-parallel/)

BibTeX

@inproceedings{azangulov2026iclr-parallel,
  title     = {{Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing}},
  author    = {Azangulov, Iskander and Pandeva, Teodora and Prasad, Niranjani and Zazo, Javier and Karmalkar, Sushrut},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/azangulov2026iclr-parallel/}
}