Learning Distinct Features Helps, Provably

Abstract

We study the diversity of the features learned by a two-layer neural network trained with the least squares loss. We measure diversity as the average $L_2$-distance between the hidden-layer features and theoretically investigate how learning non-redundant, distinct features affects the network's performance. To this end, we derive novel Rademacher-complexity-based generalization bounds for such networks that depend on feature diversity. Our analysis proves that more distinct features within the hidden layer lead to better generalization. We also show how to extend our results to deeper networks and different losses.
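As a rough illustration of the quantity the abstract describes, the sketch below computes one natural instantiation of the diversity measure: the average pairwise $L_2$-distance between the hidden units' activation vectors over a batch. This is an assumption about the measure's exact form, not the authors' code; the network, the function name `feature_diversity`, and all variables are illustrative.

```python
import numpy as np

def feature_diversity(H):
    """Average pairwise L2-distance between hidden-unit features.

    H: (n_samples, n_units) matrix of hidden-layer activations;
    column j holds unit j's responses over the batch.
    Note: one plausible reading of the paper's diversity measure,
    not necessarily its exact definition.
    """
    n_units = H.shape[1]
    total, pairs = 0.0, 0
    for i in range(n_units):
        for j in range(i + 1, n_units):
            total += np.linalg.norm(H[:, i] - H[:, j])
            pairs += 1
    return total / pairs

# Toy usage: hidden features of a random two-layer ReLU network.
rng = np.random.default_rng(0)
X = rng.normal(size=(128, 10))   # batch of inputs
W = rng.normal(size=(10, 32))    # first-layer weights
H = np.maximum(X @ W, 0.0)       # ReLU hidden activations
print(feature_diversity(H))     # larger value = more distinct features
```

Under this reading, redundant units (near-identical columns of `H`) drive the measure toward zero, which is the regime the paper's bounds penalize.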

Cite

Text

Laakom et al. "Learning Distinct Features Helps, Provably." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023. doi:10.1007/978-3-031-43415-0_13

Markdown

[Laakom et al. "Learning Distinct Features Helps, Provably." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2023.](https://mlanthology.org/ecmlpkdd/2023/laakom2023ecmlpkdd-learning/) doi:10.1007/978-3-031-43415-0_13

BibTeX

@inproceedings{laakom2023ecmlpkdd-learning,
  title     = {{Learning Distinct Features Helps, Provably}},
  author    = {Laakom, Firas and Raitoharju, Jenni and Iosifidis, Alexandros and Gabbouj, Moncef},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2023},
  pages     = {206--222},
  doi       = {10.1007/978-3-031-43415-0_13},
  url       = {https://mlanthology.org/ecmlpkdd/2023/laakom2023ecmlpkdd-learning/}
}