Comparative Generalization Bounds for Deep Neural Networks

Abstract

In this work, we investigate the generalization capabilities of deep neural networks. We introduce a novel measure of the effective depth of neural networks, defined as the first layer at which sample embeddings are separable using the nearest-class center classifier. Our empirical results demonstrate that, in standard classification settings, neural networks trained using Stochastic Gradient Descent (SGD) tend to have small effective depths. We also explore the relationship between effective depth, the complexity of the training dataset, and generalization. For instance, we find that the effective depth of a trained neural network increases as the proportion of random labels in the data rises. Finally, we derive a generalization bound by comparing the effective depth of a network with the minimal depth required to fit the same dataset with partially corrupted labels. This bound provides non-vacuous predictions of test performance and is found to be empirically independent of the actual depth of the network.
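The effective-depth measure described above suggests a simple procedure: extract the embeddings of the training samples at every layer, evaluate a nearest-class-center (NCC) classifier on each, and report the first layer whose NCC accuracy reaches a chosen separability threshold. The sketch below illustrates this idea, assuming per-layer feature matrices are already available; the helper names (ncc_accuracy, effective_depth), the NumPy-only setup, and the threshold convention are illustrative assumptions, not the authors' implementation.

import numpy as np

def ncc_accuracy(feats: np.ndarray, labels: np.ndarray) -> float:
    """Accuracy of the nearest-class-center classifier on the given features."""
    classes = np.unique(labels)
    # class centers: mean embedding of each class, shape (C, d)
    centers = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    # squared Euclidean distance of every sample to every center, shape (n, C)
    d2 = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    preds = classes[d2.argmin(axis=1)]
    return float((preds == labels).mean())

def effective_depth(per_layer_feats, labels, threshold=1.0):
    """First layer (1-indexed) whose NCC accuracy reaches `threshold`."""
    for depth, feats in enumerate(per_layer_feats, start=1):
        if ncc_accuracy(feats, labels) >= threshold:
            return depth
    # convention used here: no layer is separable at the given threshold
    return len(per_layer_feats) + 1

# Purely illustrative usage with random features in place of real embeddings
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=512)
per_layer_feats = [rng.normal(size=(512, 64)) for _ in range(5)]
print(effective_depth(per_layer_feats, labels, threshold=0.95))

In practice the per-layer features would come from forward hooks on a trained network, and the threshold (here a free parameter) would be set to the separability criterion used in the paper.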

Cite

Text

Galanti et al. "Comparative Generalization Bounds for Deep Neural Networks." Transactions on Machine Learning Research, 2023.

Markdown

[Galanti et al. "Comparative Generalization Bounds for Deep Neural Networks." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/galanti2023tmlr-comparative/)

BibTeX

@article{galanti2023tmlr-comparative,
  title     = {{Comparative Generalization Bounds for Deep Neural Networks}},
  author    = {Galanti, Tomer and Galanti, Liane and Ben-Shaul, Ido},
  journal   = {Transactions on Machine Learning Research},
  year      = {2023},
  url       = {https://mlanthology.org/tmlr/2023/galanti2023tmlr-comparative/}
}