DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models
Abstract
A major challenge in using diffusion models is aligning outputs with user-defined conditions. Existing conditional generation methods fall into two major categories: classifier-based guidance, which requires differentiable target models and gradient-based correction; and classifier-free guidance, which embeds conditions directly into the diffusion model but demands expensive joint training and architectural coupling. In this work, we introduce a third paradigm: DISCrete nOise (DISCO) guidance, which replaces the continuous conditional correction term with a finite codebook of discrete noise vectors sampled from a Gaussian prior. Conditional generation is reformulated as a code selection task, and we train a prediction network to choose the optimal code given the intermediate diffusion state and the conditioning input. Our approach is differentiability-free and training-efficient, avoiding the gradient computation and architectural redundancy of prior methods. Empirical results demonstrate that DISCO achieves competitive controllability while substantially reducing resource demands, positioning it as a scalable and effective alternative for conditional diffusion generation.
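The abstract's core idea can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: the codebook size, dimensions, the linear scorer standing in for the trained prediction network, and the `guided_step` update form are all assumptions for demonstration. It shows the key mechanics — a finite Gaussian codebook, code selection by scoring (no gradients through a classifier), and the selected discrete code replacing the continuous correction term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): K codes, D-dim state, C-dim condition.
K, D, C = 16, 8, 4

# Finite codebook of discrete noise vectors sampled from a Gaussian prior.
codebook = rng.standard_normal((K, D))

# Stand-in for the trained prediction network: a random linear scorer over
# the concatenated diffusion state and condition (illustrative only).
W = rng.standard_normal((K, D + C))

def select_code(x_t, cond):
    """Score every code for the current state and condition, then return
    the highest-scoring code: guidance as discrete code selection."""
    logits = W @ np.concatenate([x_t, cond])
    return codebook[np.argmax(logits)]

def guided_step(x_t, cond, eps_pred, alpha=0.1):
    """One (schematic) denoising update where the selected discrete noise
    code replaces the continuous conditional correction term."""
    z = select_code(x_t, cond)
    return x_t - eps_pred + alpha * z

x_t = rng.standard_normal(D)
cond = rng.standard_normal(C)
eps_pred = rng.standard_normal(D)  # placeholder for the base model's noise estimate
x_next = guided_step(x_t, cond, eps_pred)
```

Note that selection is a forward pass plus an argmax, which is why no differentiability of a target model is needed at sampling time.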
Cite
Text
Dai et al. "DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models." Advances in Neural Information Processing Systems, 2025.
Markdown
[Dai et al. "DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/dai2025neurips-disco/)
BibTeX
@inproceedings{dai2025neurips-disco,
title = {{DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models}},
author = {Dai, Longquan and Ming, Wu and Xue, Dejiao and Wang, He and Tang, Jinhui},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/dai2025neurips-disco/}
}