D-VST: Diffusion Transformer for Pathology-Correct Tone-Controllable Cross-Dye Virtual Staining of Whole Slide Images

Abstract

Diffusion-based virtual staining methods of histopathology images have demonstrated outstanding potential for stain normalization and cross-dye staining (e.g., hematoxylin-eosin to immunohistochemistry). However, achieving pathology-correct cross-dye virtual staining with versatile tone controls poses significant challenges due to the difficulty of decoupling the given pathology and tone conditions. This issue would cause non-pathologic regions to be mistakenly stained like pathologic ones, and vice versa, which we term “pathology leakage.” To address this issue, we propose diffusion virtual staining Transformer (D-VST), a new framework with versatile tone control for cross-dye virtual staining. Specifically, we introduce a pathology encoder in conjunction with a tone encoder, combined with a two-stage curriculum learning scheme that decouples pathology and tone conditions, to enable tone control while eliminating pathology leakage. Further, to extend our method for billion-pixel whole slide image (WSI) staining, we introduce a novel frequency-aware adaptive patch sampling strategy for high-quality yet efficient inference of ultra-high resolution images in a zero-shot manner. Integrating these two innovative components facilitates a pathology-correct, tone-controllable, cross-dye WSI virtual staining process. Extensive experiments on three virtual staining tasks that involve translating between four different dyes demonstrate the superiority of our approach in generating high-quality and pathologically accurate images compared to existing methods based on generative adversarial networks and diffusion models. Our code and trained models will be released.

Cite

Text

Yang et al. "D-VST: Diffusion Transformer for Pathology-Correct Tone-Controllable Cross-Dye Virtual Staining of Whole Slide Images." Advances in Neural Information Processing Systems, 2025.

Markdown

[Yang et al. "D-VST: Diffusion Transformer for Pathology-Correct Tone-Controllable Cross-Dye Virtual Staining of Whole Slide Images." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/yang2025neurips-dvst/)

BibTeX

@inproceedings{yang2025neurips-dvst,
  title     = {{D-VST: Diffusion Transformer for Pathology-Correct Tone-Controllable Cross-Dye Virtual Staining of Whole Slide Images}},
  author    = {Yang, Shurong and Wei, Dong and Hu, Yihuang and Peng, Qiong and Liu, Hong and Huang, Yawen and Wu, Xian and Zheng, Yefeng and Wang, Liansheng},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/yang2025neurips-dvst/}
}