Flow-MIL: Constructing Highly-Expressive Latent Feature Space for Whole Slide Image Classification Using Normalizing Flow

Yingfan Ma, Bohan An, Ao Shen, Mingzhi Yuan, Minghong Duan, Manning Wang

ICCV 2025 pp. 23561-23570

Abstract

Whole Slide Image (WSI) classification is widely used in pathological diagnosis and prognosis prediction. It is commonly formulated as a weakly supervised Multiple Instance Learning (MIL) problem because WSIs are very large and fine-grained annotations are difficult to obtain. In the MIL formulation, a WSI is treated as a bag and the patches cut from it as its instances; most existing methods first extract instance features and then aggregate them into a bag feature with an attention-based mechanism for bag-level prediction. Because these models are trained with bag-level labels only, they often lack instance-level insight and lose detailed semantic information, which limits their bag-level classification performance and their ability to exploit highly expressive information. In this paper, we propose Flow-MIL, which uses a normalizing flow-based Latent Semantic Embedding Space (LSES) to enhance feature representation. By mapping patches into the LSES, a simple yet highly expressive latent space, Flow-MIL achieves effective slide-level aggregation while preserving critical semantic information. We also introduce Gaussian Mixture Model-based Latent Semantic Prototypes (LSP) within the LSES to capture class-specific pathological distributions and refine pseudo instance labels. Extensive experiments on three public WSI datasets show that Flow-MIL outperforms recent state-of-the-art (SOTA) methods in both bag-level and instance-level classification and offers improved interpretability.
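For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of generic attention-based MIL pooling: each instance feature receives a score, the scores are normalized with a softmax, and the bag feature is the weighted sum of instance features. It is not the Flow-MIL method itself. The function name `attention_pool`, the parameters `V` and `w`, and the toy values are all hypothetical and chosen only for illustration; in practice these parameters would be learned from bag-level labels.

```python
import math
from typing import List, Sequence, Tuple


def attention_pool(
    instances: Sequence[Sequence[float]],
    V: Sequence[Sequence[float]],
    w: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Aggregate instance features into one bag feature.

    Hypothetical parameters (assumed, not taken from the paper):
      V: hidden_dim x feature_dim projection matrix
      w: hidden_dim scoring vector
    Returns (bag_feature, attention_weights).
    """
    # Unnormalized attention score per instance: w . tanh(V h_k)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))

    # Numerically stable softmax over the instances of the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Bag feature: attention-weighted sum of the instance features
    dim = len(instances[0])
    bag = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return bag, weights


# Toy bag with three 2-dimensional "patch" features (made-up values)
toy_instances = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_V = [[1.0, -1.0]]  # one hidden unit
toy_w = [2.0]
bag_feature, attention = attention_pool(toy_instances, toy_V, toy_w)
print(bag_feature, attention)
```

The attention weights give a rough per-patch importance, but because they are trained from bag-level labels alone they are only an indirect signal about individual instances, which is the limitation the abstract describes.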

Cite

Text

Ma et al. "Flow-MIL: Constructing Highly-Expressive Latent Feature Space for Whole Slide Image Classification Using Normalizing Flow." International Conference on Computer Vision, 2025.

Markdown

[Ma et al. "Flow-MIL: Constructing Highly-Expressive Latent Feature Space for Whole Slide Image Classification Using Normalizing Flow." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/ma2025iccv-flowmil/)

BibTeX

@inproceedings{ma2025iccv-flowmil,
  title     = {{Flow-MIL: Constructing Highly-Expressive Latent Feature Space for Whole Slide Image Classification Using Normalizing Flow}},
  author    = {Ma, Yingfan and An, Bohan and Shen, Ao and Yuan, Mingzhi and Duan, Minghong and Wang, Manning},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {23561--23570},
  url       = {https://mlanthology.org/iccv/2025/ma2025iccv-flowmil/}
}