From Pretraining to Pathology: How Noise Leads to Catastrophic Inheritance in Medical Models
Abstract
Foundation models pretrained on web-scale data drive contemporary transfer learning in vision, language, and multimodal tasks. Recent work shows that mild label noise in these corpora may lift in-distribution accuracy yet sharply reduce out-of-distribution generalization, an effect known as catastrophic inheritance. Medical data is especially sensitive because annotations are scarce, domain shifts are large, and pretraining sources are noisy. We present the first systematic analysis of catastrophic inheritance in medical models. Controlled label-corruption experiments expose a clear structural collapse: as noise rises, the skewness and kurtosis of feature and logit distributions decline, signaling a flattened representation space and diminished discriminative detail. These higher-order statistics form a compact, interpretable marker of degradation in fine-grained tasks such as histopathology. Guided by this finding, we introduce a fine-tuning objective that restores skewness and kurtosis through two scalar regularizers added to the task loss. The method leaves the backbone unchanged and incurs negligible overhead. Tests on PLIP models trained with Twitter pathology images, as well as other large-scale vision and language backbones, show consistent gains in robustness and cross-domain accuracy under varied noise levels.
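The abstract describes adding two scalar regularizers to the task loss that restore the skewness and kurtosis of feature distributions. A minimal sketch of that idea in PyTorch is shown below; the function names, penalty form (squared deviation from reference statistics), and weights `lam_skew`/`lam_kurt` are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def skewness(x, dim=0, eps=1e-6):
    # Third standardized moment of x along `dim` (per feature channel).
    mu = x.mean(dim=dim, keepdim=True)
    sigma = x.std(dim=dim, keepdim=True) + eps
    return (((x - mu) / sigma) ** 3).mean(dim=dim)

def kurtosis(x, dim=0, eps=1e-6):
    # Fourth standardized moment of x along `dim` (Gaussian reference: 3).
    mu = x.mean(dim=dim, keepdim=True)
    sigma = x.std(dim=dim, keepdim=True) + eps
    return (((x - mu) / sigma) ** 4).mean(dim=dim)

def moment_regularized_loss(task_loss, features, target_skew, target_kurt,
                            lam_skew=0.1, lam_kurt=0.1):
    # Two scalar penalties pull the batch's higher-order feature statistics
    # toward reference values (e.g. those measured on a clean model),
    # counteracting the distribution flattening induced by label noise.
    skew_pen = (skewness(features) - target_skew).pow(2).mean()
    kurt_pen = (kurtosis(features) - target_kurt).pow(2).mean()
    return task_loss + lam_skew * skew_pen + lam_kurt * kurt_pen
```

Because the penalties act only on batch statistics of the features, the backbone architecture is untouched and the extra cost per step is negligible, consistent with the abstract's description.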
Cite
Text
Sun et al. "From Pretraining to Pathology: How Noise Leads to Catastrophic Inheritance in Medical Models." Advances in Neural Information Processing Systems, 2025.
Markdown
[Sun et al. "From Pretraining to Pathology: How Noise Leads to Catastrophic Inheritance in Medical Models." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/sun2025neurips-pretraining/)
BibTeX
@inproceedings{sun2025neurips-pretraining,
title = {{From Pretraining to Pathology: How Noise Leads to Catastrophic Inheritance in Medical Models}},
author = {Sun, Hao and Han, Zhongyi and Chen, Hao and Wang, Jindong and Gao, Xin and Yin, Yilong},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/sun2025neurips-pretraining/}
}