Jensen-Tsallis Divergence for Supervised Classification Under Data Imbalance
Abstract
In supervised classification problems using Deep Neural Networks, the loss function is typically based on the Kullback–Leibler divergence. However, alternative entropic divergence formulations, such as the Jensen–Shannon Divergence (JSD), have recently garnered attention for their unique properties. In this study, we delve deeper into the interpretation of the JSD and its generalized form, the Jensen–Tsallis Divergence (JTD), as alternative loss functions for supervised classification. When the true label probabilities are provided as one-hot encoded distributions, we demonstrate that these divergences impose an intrinsic regularization of the output confidence that prevents overfitting. Additionally, the non-extensive parameter q of the JTD directly influences the structure of the regularizer, offering increased flexibility in the formulation of the loss function. Through experiments conducted on artificially imbalanced versions of MNIST, Fashion-MNIST, SVHN, and CIFAR-10, we show that JTD outperforms JSD and other traditional loss functions in terms of generalization performance, especially on highly imbalanced datasets.
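As a rough illustration of the loss described in the abstract, the following is a minimal sketch (not the authors' reference implementation) of a Jensen-Tsallis divergence between predicted class probabilities and one-hot targets. It assumes the common definition JT_q(P, Q) = S_q(M) - (S_q(P) + S_q(Q)) / 2 with M = (P + Q) / 2, where S_q is the Tsallis entropy and q -> 1 recovers the Jensen-Shannon case; the function names and the example value q = 2.0 are illustrative choices, not taken from the paper.

import numpy as np

def tsallis_entropy(p, q, eps=1e-12):
    """Tsallis entropy S_q(p); falls back to Shannon entropy as q -> 1."""
    p = np.clip(p, eps, 1.0)
    if np.isclose(q, 1.0):
        return -np.sum(p * np.log(p), axis=-1)
    return (1.0 - np.sum(p ** q, axis=-1)) / (q - 1.0)

def jensen_tsallis_loss(probs, targets_onehot, q=2.0):
    """Mean JT_q divergence between predictions and one-hot labels."""
    m = 0.5 * (probs + targets_onehot)
    jt = tsallis_entropy(m, q) - 0.5 * (
        tsallis_entropy(probs, q) + tsallis_entropy(targets_onehot, q)
    )
    return jt.mean()

# Toy usage: 3-class softmax outputs vs. one-hot labels.
probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
targets = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(jensen_tsallis_loss(probs, targets, q=2.0))

In a deep-learning setting the same formula would be written with a framework's differentiable tensor operations so that gradients flow to the network parameters; the NumPy version above is only meant to make the definition concrete.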
Cite
Text
Squicciarini et al. "Jensen-Tsallis Divergence for Supervised Classification Under Data Imbalance." Machine Learning, 2025. doi:10.1007/S10994-025-06791-4
Markdown
[Squicciarini et al. "Jensen-Tsallis Divergence for Supervised Classification Under Data Imbalance." Machine Learning, 2025.](https://mlanthology.org/mlj/2025/squicciarini2025mlj-jensentsallis/) doi:10.1007/S10994-025-06791-4
BibTeX
@article{squicciarini2025mlj-jensentsallis,
title = {{Jensen-Tsallis Divergence for Supervised Classification Under Data Imbalance}},
author = {Squicciarini, Antonio and Trigano, Tom and Luengo, David},
journal = {Machine Learning},
year = {2025},
pages = {162},
doi = {10.1007/S10994-025-06791-4},
volume = {114},
url = {https://mlanthology.org/mlj/2025/squicciarini2025mlj-jensentsallis/}
}