On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality

Abstract

We investigate the approximation and estimation rates of conditional diffusion transformers (DiTs) with classifier-free guidance. We present a comprehensive analysis for “in-context” conditional DiTs under various common assumptions: generic and strong Hölder, linear latent (subspace), and Lipschitz score function assumptions. Importantly, we establish minimax optimality of DiTs by leveraging score function regularity. Specifically, we discretize the input domains into infinitesimal grids and then perform term-by-term Taylor expansions on the conditional diffusion score function under the Hölder smooth data assumption. This enables fine-grained use of transformers’ universal approximation through a more detailed piecewise constant approximation, and hence obtains tighter bounds. Additionally, we extend our analysis to latent settings. Our findings establish statistical limits for DiTs and offer practical guidance toward more efficient and accurate designs.

Cite

Text

Hu et al. "On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality." International Conference on Learning Representations, 2025.

Markdown

[Hu et al. "On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/hu2025iclr-statistical/)

BibTeX

@inproceedings{hu2025iclr-statistical,
  title     = {{On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality}},
  author    = {Hu, Jerry Yao-Chieh and Wu, Weimin and Lee, Yi-Chen and Huang, Yu-Chao and Chen, Minshuo and Liu, Han},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/hu2025iclr-statistical/}
}