Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning

Abstract

Representation learning of pathology whole-slide images (WSIs) has primarily relied on weak supervision with Multiple Instance Learning (MIL). This approach leads to slide representations that are highly tailored to a specific clinical task. Self-supervised learning (SSL) has been successfully applied to train histopathology foundation models (FMs) for patch embedding generation. However, generating patient- or slide-level embeddings remains challenging. Existing approaches for slide representation learning extend the principles of SSL from patch-level learning to entire slides by aligning different augmentations of the slide or by utilizing multimodal data. By integrating tile embeddings from multiple FMs, we propose a new single-modality SSL method in feature space that generates useful slide representations. Our contrastive pretraining strategy, called COBRA, employs multiple FMs and an architecture based on Mamba-2. COBRA exceeds the performance of state-of-the-art slide encoders on four different public Clinical Proteomic Tumor Analysis Consortium (CPTAC) cohorts on average by at least +4.4% AUC, despite only being pretrained on 3048 WSIs from The Cancer Genome Atlas (TCGA). Additionally, COBRA is readily compatible at inference time with previously unseen feature extractors. Code is available at https://github.com/KatherLab/COBRA
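The idea of contrasting slide embeddings built from different FMs can be illustrated with a small sketch. This is not the COBRA implementation: the abstract does not specify the loss, so a generic InfoNCE objective is assumed here, mean pooling stands in for the Mamba-2-based aggregator, and both FMs are assumed to have been projected to a shared embedding dimension already. The function names `mean_pool`, `l2_normalize`, and `info_nce` are hypothetical.

```python
import math


def mean_pool(tiles):
    """Average tile embeddings into one slide vector.

    Assumption: a simple stand-in for a learned aggregator.
    """
    n = len(tiles)
    return [sum(col) / n for col in zip(*tiles)]


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss over a batch of slide embeddings.

    view_a[i] and view_b[i] embed the same slide via two different FMs
    (the positive pair); all other slides in the batch act as negatives.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    loss = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of slide i (view A) to every slide in view B.
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(a)


# Toy data: 2 slides, 2 tiles each, embedding dimension 2, from two hypothetical FMs.
tiles_fm_a = [[[1.0, 0.0], [0.8, 0.2]], [[0.0, 1.0], [0.2, 0.8]]]
tiles_fm_b = [[[0.9, 0.1], [1.0, 0.0]], [[0.1, 0.9], [0.0, 1.0]]]

slides_a = [mean_pool(t) for t in tiles_fm_a]
slides_b = [mean_pool(t) for t in tiles_fm_b]

aligned_loss = info_nce(slides_a, slides_b)
shuffled_loss = info_nce(slides_a, slides_b[::-1])
```

In this toy setup the loss is lower when the two views of each slide are correctly paired than when the pairing is shuffled, which is the signal a contrastive pretraining objective of this kind optimizes.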

Cite

Text

Lenz et al. "Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02869

Markdown

[Lenz et al. "Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/lenz2025cvpr-unsupervised/) doi:10.1109/CVPR52734.2025.02869

BibTeX

@inproceedings{lenz2025cvpr-unsupervised,
  title     = {{Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning}},
  author    = {Lenz, Tim and Neidlinger, Peter and Ligero, Marta and Wölflein, Georg and van Treeck, Marko and Kather, Jakob N.},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {30807-30817},
  doi       = {10.1109/CVPR52734.2025.02869},
  url       = {https://mlanthology.org/cvpr/2025/lenz2025cvpr-unsupervised/}
}