Navigating the MIL Trade-Off: Flexible Pooling for Whole Slide Image Classification

Abstract

Multiple Instance Learning (MIL) is a standard weakly supervised approach for Whole Slide Image (WSI) classification, where performance hinges on both feature representation and MIL pooling strategies. Recent research has predominantly focused on Transformer-based architectures adapted for WSIs. However, we argue that this trend faces a fundamental limitation: data scarcity. In typical settings, Transformer models yield only marginal gains without access to large-scale datasets, resources that are virtually inaccessible to all but a few well-funded research labs. Motivated by this, we revisit simple, non-attention MIL with unsupervised slide features and analyze log-sum-exp (LSE) pooling controlled by a temperature $\beta$. For slides partitioned into $N$ patches, we theoretically show that LSE has a smooth transition at a critical threshold $\beta_{\mathrm{crit}}=\mathcal{O}(\log N)$, interpolating between mean-like aggregation (stable, with better generalization but less sensitivity) and max-like aggregation (more sensitive but with looser generalization bounds). Grounded in this analysis, we introduce Maxsoft, a novel MIL pooling function that enables flexible control over this trade-off, allowing adaptation to specific tasks and datasets. To further tackle real-world deployment challenges such as specimen heterogeneity, we propose PerPatch augmentation, a simple yet effective technique that enhances model robustness. Empirically, Maxsoft achieves state-of-the-art performance in low-data regimes across four major benchmarks (CAMELYON16, CAMELYON17, TCGA-Lung, and SICAP-MIL), often matching or surpassing large-scale foundation models. Combining Maxsoft with PerPatch augmentation improves performance further through increased robustness. Code is available at https://github.com/jafarinia/maxsoft.
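
To make the mean-to-max interpolation concrete, here is a minimal sketch of the standard normalized log-sum-exp pooling, $\mathrm{LSE}_\beta(s) = \frac{1}{\beta}\log\big(\frac{1}{N}\sum_{i=1}^{N} e^{\beta s_i}\big)$, applied to a bag of per-patch scores. It is not the paper's Maxsoft function, and the name `lse_pool` and the toy scores are assumptions made for illustration. For $\beta > 0$ this form satisfies $\max_i s_i - \frac{\log N}{\beta} \le \mathrm{LSE}_\beta(s) \le \max_i s_i$, so the pooled value is close to the maximum once $\beta$ is large relative to $\log N$, while for $\beta \to 0$ it approaches the mean.

```python
import math


def lse_pool(scores, beta):
    """Normalized log-sum-exp pooling over a bag of patch scores.

    Illustrative sketch only (not the paper's Maxsoft):
    (1 / beta) * log(mean(exp(beta * s_i))).
    """
    n = len(scores)
    if beta == 0:
        # Limit of the expression as beta -> 0 is the plain mean.
        return sum(scores) / n
    # Subtract the largest exponent before exponentiating to avoid overflow.
    shift = max(beta * s for s in scores)
    total = sum(math.exp(beta * s - shift) for s in scores)
    return (shift + math.log(total / n)) / beta


# Toy bag: one high-scoring patch among N = 4 (hypothetical values).
bag = [0.0, 0.0, 0.0, 1.0]
for beta in (1e-6, 1.0, math.log(len(bag)), 10.0, 1000.0):
    print(f"beta={beta:g}  pooled={lse_pool(bag, beta):.4f}")
```

Running the loop shows the pooled score moving from roughly the bag mean at very small $\beta$ toward the bag maximum at large $\beta$, which is the trade-off the abstract describes.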

Cite

Text

Jafarinia et al. "Navigating the MIL Trade-Off: Flexible Pooling for Whole Slide Image Classification." Advances in Neural Information Processing Systems, 2025.

Markdown

[Jafarinia et al. "Navigating the MIL Trade-Off: Flexible Pooling for Whole Slide Image Classification." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/jafarinia2025neurips-navigating/)

BibTeX

@inproceedings{jafarinia2025neurips-navigating,
  title     = {{Navigating the MIL Trade-Off: Flexible Pooling for Whole Slide Image Classification}},
  author    = {Jafarinia, Hossein and Hamdi, Danial and Alamdar, Amirhossein and Zahiri, Elahe and Tabar, Soroush Vafaie and Alipanah, Alireza and Mirzaie, Nahal and Razavi, Saeed and Najafi, Amir and Rohban, Mohammad Hossein},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/jafarinia2025neurips-navigating/}
}