Optimizing Neural Networks with Gradient Lexicase Selection

Abstract

One potential drawback of using aggregated performance measurement in machine learning is that models may learn to accept higher errors on some training cases as compromises for lower errors on others, with the lower errors actually being instances of overfitting. This can lead both to stagnation at local optima and to poor generalization. Lexicase selection is an uncompromising selection method developed in evolutionary computation: instead of using aggregated metrics such as loss and accuracy, it selects models on the basis of sequences of individual training-case errors. In this paper, we investigate how the general idea of lexicase selection can be applied in the context of deep learning to improve generalization. We propose Gradient Lexicase Selection, an optimization framework that combines gradient descent and lexicase selection in an evolutionary fashion. Experimental results show that the proposed method improves the generalization performance of several popular deep neural network architectures on three image classification benchmarks. Qualitative analysis further indicates that our method helps the networks learn more diverse representations.
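To make the selection step concrete, the sketch below implements plain lexicase selection over a matrix of per-case errors. It is a minimal illustration only, not the paper's full Gradient Lexicase Selection training loop (which the abstract describes as combining gradient descent with selection in an evolutionary fashion); the function name, the error-matrix layout, and the toy numbers are assumptions made for this example.

```python
import random

def lexicase_select(case_errors, rng=None):
    """Select one candidate by lexicase selection.

    case_errors[i][j] is candidate i's error on training case j.
    Returns the index of the selected candidate.
    (Illustrative sketch; names and layout are assumptions, not the paper's API.)
    """
    rng = rng or random.Random()
    survivors = list(range(len(case_errors)))
    case_order = list(range(len(case_errors[0])))
    rng.shuffle(case_order)  # consider training cases in a random order

    for case in case_order:
        if len(survivors) == 1:
            break
        # keep only the candidates with the lowest error on the current case
        best = min(case_errors[i][case] for i in survivors)
        survivors = [i for i in survivors if case_errors[i][case] == best]

    return rng.choice(survivors)  # break any remaining ties at random

# Example: 4 candidate models evaluated on 6 individual training cases.
errors = [
    [0.1, 0.9, 0.2, 0.0, 0.5, 0.3],
    [0.1, 0.1, 0.8, 0.0, 0.5, 0.3],
    [0.4, 0.1, 0.2, 0.7, 0.5, 0.3],
    [0.1, 0.1, 0.2, 0.0, 0.9, 0.3],
]
print(lexicase_select(errors, rng=random.Random(0)))
```

Because the cases are shuffled on every selection event, different cases act as the primary filter at different times, so a candidate that excels on particular cases can survive even when its aggregate error is not the lowest.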

Cite

Text

Ding and Spector. "Optimizing Neural Networks with Gradient Lexicase Selection." International Conference on Learning Representations, 2022.

Markdown

[Ding and Spector. "Optimizing Neural Networks with Gradient Lexicase Selection." International Conference on Learning Representations, 2022.](https://mlanthology.org/iclr/2022/ding2022iclr-optimizing/)

BibTeX

@inproceedings{ding2022iclr-optimizing,
  title     = {{Optimizing Neural Networks with Gradient Lexicase Selection}},
  author    = {Ding, Li and Spector, Lee},
  booktitle = {International Conference on Learning Representations},
  year      = {2022},
  url       = {https://mlanthology.org/iclr/2022/ding2022iclr-optimizing/}
}