Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It
Abstract
Label smoothing (LS) is a popular regularisation method for training neural networks as it is effective in improving test accuracy and is simple to implement. "Hard" one-hot labels are "smoothed" by uniformly distributing probability mass to other classes, reducing overfitting. Prior work has shown that in some cases *LS can degrade selective classification (SC)* -- where the aim is to reject misclassifications using a model's uncertainty. In this work, we first demonstrate empirically across an extended range of large-scale tasks and architectures that LS *consistently* degrades SC. We then address a gap in existing knowledge, providing an *explanation* for this behaviour by analysing logit-level gradients: LS degrades the uncertainty rank ordering of correct vs incorrect predictions by regularising the max logit *more* when a prediction is likely to be correct, and *less* when it is likely to be wrong. This elucidates previously reported experimental results where strong classifiers underperform in SC. We then demonstrate the empirical effectiveness of post-hoc *logit normalisation* for recovering lost SC performance caused by LS. Furthermore, linking back to our gradient analysis, we again provide an explanation for why such normalisation is effective.
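Below is a minimal PyTorch sketch (not the authors' released code) illustrating the two ingredients described in the abstract: uniform label smoothing applied to one-hot targets, and a post-hoc logit-normalised confidence score used to accept or reject predictions in selective classification. The function names, the choice of the L2 norm, and the thresholding interface are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F


def smooth_labels(targets: torch.Tensor, num_classes: int, eps: float = 0.1) -> torch.Tensor:
    """Uniform label smoothing: put (1 - eps) on the true class and spread
    eps uniformly over all classes (illustrative epsilon value)."""
    one_hot = F.one_hot(targets, num_classes).float()
    return one_hot * (1.0 - eps) + eps / num_classes


def confidence_msp(logits: torch.Tensor) -> torch.Tensor:
    """Baseline confidence: maximum softmax probability (MSP)."""
    return logits.softmax(dim=-1).max(dim=-1).values


def confidence_msp_lognorm(logits: torch.Tensor, p: float = 2.0) -> torch.Tensor:
    """Post-hoc logit normalisation: rescale each logit vector by its p-norm
    before computing MSP. The norm order p is a hyperparameter here, not
    necessarily the paper's exact choice."""
    normed = logits / (logits.norm(p=p, dim=-1, keepdim=True) + 1e-12)
    return normed.softmax(dim=-1).max(dim=-1).values


def selective_predict(logits: torch.Tensor, threshold: float, score_fn=confidence_msp):
    """Selective classification: predict the argmax class, but only accept
    predictions whose confidence score exceeds the threshold."""
    conf = score_fn(logits)
    preds = logits.argmax(dim=-1)
    accept = conf >= threshold
    return preds, accept
```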
Cite
Text
Xia et al. "Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It." International Conference on Learning Representations, 2025.
Markdown
[Xia et al. "Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/xia2025iclr-understanding/)
BibTeX
@inproceedings{xia2025iclr-understanding,
  title     = {{Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It}},
  author    = {Xia, Guoxuan and Laurent, Olivier and Franchi, Gianni and Bouganis, Christos-Savvas},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/xia2025iclr-understanding/}
}