TAP-CT: 3D Task-Agnostic Pretraining of Computed Tomography Foundation Models

Tim Veenboer, George Yiasemis, Eric Marcus, Vivien van Veldhuizen, Cees G. M. Snoek, Jonas Teuwen, Kevin B. W. Groot Lipman

MIDL 2026 pp. 726-753

Abstract

Existing foundation models (FMs) in the medical domain often require extensive fine-tuning or rely on training resource-intensive decoders, while many existing encoders are pretrained with objectives biased toward specific tasks. This illustrates a need for a strong, task-agnostic foundation model that requires minimal fine-tuning beyond feature extraction. In this work, we introduce a suite of task-agnostic pretraining of CT foundation models (TAP-CT): a simple yet effective adaptation of Vision Transformers (ViTs) and DINOv2 for volumetric data, enabling scalable self-supervised pretraining directly on 3D CT volumes. Our approach incorporates targeted modifications to patch embeddings, positional encodings, and volumetric augmentations, making the architecture depth-aware while preserving the simplicity of the underlying architectures. We show that large-scale 3D pretraining on an extensive in-house CT dataset (105K volumes) yields stable, robust frozen representations that generalize strongly across downstream tasks. To promote transparency and reproducibility, and to establish a powerful, low-resource baseline for future research in medical imaging, we will release all pretrained models, experimental configurations, and downstream benchmark code.
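
As an illustration of what a depth-aware patch embedding can look like, the sketch below splits a 3D volume into non-overlapping 3D patches and projects each flattened patch to a token vector. It is not the authors' implementation: the abstract does not state the patch size, embedding dimension, or input shape, so `patch_size`, `embed_dim`, the volume shape, and the helper names `patchify_3d` and `embed_patches` are all assumptions chosen for the example.

```python
import numpy as np


def patchify_3d(volume, patch_size):
    """Split a (D, H, W) volume into non-overlapping 3D patches.

    Returns the flattened patches (one row per patch) and the patch grid shape.
    """
    d, h, w = volume.shape
    pd, ph, pw = patch_size
    if d % pd or h % ph or w % pw:
        raise ValueError("volume shape must be divisible by patch_size")
    gd, gh, gw = d // pd, h // ph, w // pw
    # Separate grid axes from within-patch axes, then bring grid axes first:
    # (gd, pd, gh, ph, gw, pw) -> (gd, gh, gw, pd, ph, pw)
    x = volume.reshape(gd, pd, gh, ph, gw, pw).transpose(0, 2, 4, 1, 3, 5)
    tokens = x.reshape(gd * gh * gw, pd * ph * pw)
    return tokens, (gd, gh, gw)


def embed_patches(tokens, weight, bias):
    """Linearly project flattened patches to the embedding dimension."""
    return tokens @ weight + bias


# Illustrative values only (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 32, 32)).astype(np.float32)  # (depth, height, width)
patch_size = (4, 8, 8)
embed_dim = 16

tokens, grid = patchify_3d(volume, patch_size)
weight = (0.02 * rng.standard_normal((tokens.shape[1], embed_dim))).astype(np.float32)
bias = np.zeros(embed_dim, dtype=np.float32)
embeddings = embed_patches(tokens, weight, bias)

print(tokens.shape, grid, embeddings.shape)
```

With these assumed sizes the volume yields a 2 x 4 x 4 grid of patches. Because the grid shape keeps the depth axis separate, a positional encoding could be indexed per (depth, height, width) position rather than per 2D location.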

Cite

Text

Veenboer et al. "TAP-CT: 3D Task-Agnostic Pretraining of Computed Tomography Foundation Models." Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, 2026.

Markdown

[Veenboer et al. "TAP-CT: 3D Task-Agnostic Pretraining of Computed Tomography Foundation Models." Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, 2026.](https://mlanthology.org/midl/2026/veenboer2026midl-tapct/)

BibTeX

@inproceedings{veenboer2026midl-tapct,
  title     = {{TAP-CT: 3D Task-Agnostic Pretraining of Computed Tomography Foundation Models}},
  author    = {Veenboer, Tim and Yiasemis, George and Marcus, Eric and van Veldhuizen, Vivien and Snoek, Cees G. M. and Teuwen, Jonas and Groot Lipman, Kevin B. W.},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  year      = {2026},
  pages     = {726--753},
  volume    = {315},
  url       = {https://mlanthology.org/midl/2026/veenboer2026midl-tapct/}
}