From Label Smoothing to Label Relaxation
Abstract
Regularization of (deep) learning models can be realized at the model, loss, or data level. As a technique somewhere in-between loss and data, label smoothing turns deterministic class labels into probability distributions, for example by uniformly distributing a certain part of the probability mass over all classes. A predictive model is then trained on these distributions as targets, using cross-entropy as loss function. While this method has shown improved performance compared to non-smoothed cross-entropy, we argue that the use of a smoothed though still precise probability distribution as a target can be questioned from a theoretical perspective. As an alternative, we propose a generalized technique called label relaxation, in which the target is a set of probabilities represented in terms of an upper probability distribution. This leads to a genuine relaxation of the target instead of a distortion, thereby reducing the risk of incorporating an undesirable bias in the learning process. Methodologically, label relaxation leads to the minimization of a novel type of loss function, for which we propose a suitable closed-form expression for model optimization. The effectiveness of the approach is demonstrated in an empirical study on image data.
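The abstract describes the smoothing step concretely enough to illustrate in code. The following minimal sketch (plain NumPy; the helper names smooth_labels and cross_entropy are chosen here for illustration and do not come from the paper) builds smoothed targets by uniformly redistributing a fraction alpha of the probability mass over all classes and evaluates the cross-entropy loss against them, assuming the model outputs a softmax distribution.

```python
import numpy as np

def smooth_labels(y_onehot, alpha):
    """Turn one-hot targets into smoothed distributions by uniformly
    spreading a fraction alpha of the probability mass over all classes."""
    k = y_onehot.shape[-1]
    return (1.0 - alpha) * y_onehot + alpha / k

def cross_entropy(p_target, p_pred, eps=1e-12):
    """Cross-entropy H(p_target, p_pred), the loss used with smoothed targets."""
    return -np.sum(p_target * np.log(p_pred + eps), axis=-1)

# Example: 3-class problem, true class 0, softmax output of the model
y = np.array([1.0, 0.0, 0.0])
p = np.array([0.7, 0.2, 0.1])
print(cross_entropy(smooth_labels(y, alpha=0.1), p))
```

Label relaxation, by contrast, replaces the single smoothed target distribution by a set of distributions characterized by an upper probability; the corresponding closed-form loss is derived in the paper and is not reproduced here.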
Cite
Text
Lienen and Hüllermeier. "From Label Smoothing to Label Relaxation." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I10.17041
Markdown
[Lienen and Hüllermeier. "From Label Smoothing to Label Relaxation." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/lienen2021aaai-label/) doi:10.1609/AAAI.V35I10.17041
BibTeX
@inproceedings{lienen2021aaai-label,
title = {{From Label Smoothing to Label Relaxation}},
author = {Lienen, Julian and Hüllermeier, Eyke},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {8583--8591},
doi = {10.1609/AAAI.V35I10.17041},
url = {https://mlanthology.org/aaai/2021/lienen2021aaai-label/}
}