Class-Wise Generalization Error: An Information-Theoretic Analysis
Abstract
Existing generalization theories for supervised learning typically take a holistic approach and provide bounds on the expected generalization over the whole data distribution, which implicitly assumes that the model generalizes similarly across all classes. In practice, however, generalization performance varies significantly among classes, and this variation cannot be captured by existing generalization bounds. In this work, we tackle this problem by theoretically studying the class-generalization error, which quantifies the generalization performance of the model for each individual class. We derive a novel information-theoretic bound on the class-generalization error based on the KL divergence, and we further obtain several tighter bounds using recent advances in conditional mutual information (CMI) bounds, which enable practical evaluation. We empirically validate the proposed bounds on various neural networks and show that they accurately capture the complex class-generalization behavior. Moreover, we demonstrate that the theoretical tools developed in this work can be applied to several other problems.
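For intuition, the class-generalization error of a class y can be read as the gap between the model's expected loss on test examples of class y and its average loss on training examples of class y. Below is a minimal NumPy sketch of how such per-class gaps might be estimated empirically from per-sample losses; the function name and the synthetic data are illustrative assumptions, not the authors' implementation or estimator.

import numpy as np

def class_generalization_gaps(train_losses, train_labels,
                              test_losses, test_labels, num_classes):
    # For each class y, estimate the per-class generalization gap as the
    # mean test loss on examples of class y minus the mean training loss
    # on examples of class y. (Hypothetical helper, not the paper's code.)
    gaps = np.full(num_classes, np.nan)
    for y in range(num_classes):
        train_y = train_losses[train_labels == y]
        test_y = test_losses[test_labels == y]
        if train_y.size and test_y.size:
            gaps[y] = test_y.mean() - train_y.mean()
    return gaps

# Toy usage with synthetic per-sample losses for a 10-class problem:
rng = np.random.default_rng(0)
tr_labels = rng.integers(0, 10, size=5000)
te_labels = rng.integers(0, 10, size=1000)
tr_losses = rng.exponential(0.1, size=5000)  # low training losses
te_losses = rng.exponential(0.3, size=1000)  # higher test losses
print(class_generalization_gaps(tr_losses, tr_labels,
                                te_losses, te_labels, num_classes=10))

In this toy setup every class has roughly the same gap by construction; on real models, such per-class estimates reveal exactly the class-to-class variation the abstract describes.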
Cite

Text:
Laakom et al. "Class-Wise Generalization Error: An Information-Theoretic Analysis." Transactions on Machine Learning Research, 2025.

Markdown:
[Laakom et al. "Class-Wise Generalization Error: An Information-Theoretic Analysis." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/laakom2025tmlr-classwise/)

BibTeX:
@article{laakom2025tmlr-classwise,
title = {{Class-Wise Generalization Error: An Information-Theoretic Analysis}},
author = {Laakom, Firas and Gabbouj, Moncef and Schmidhuber, Jürgen and Bu, Yuheng},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/laakom2025tmlr-classwise/}
}