Differentiable Top-K Classification Learning

Abstract

The top-k classification accuracy is one of the core metrics in machine learning. Here, k is conventionally a positive integer, such as 1 or 5, leading to top-1 or top-5 training objectives. In this work, we relax this assumption and optimize the model for multiple k simultaneously instead of using a single k. Leveraging recent advances in differentiable sorting and ranking, we propose a family of differentiable top-k cross-entropy classification losses. This allows training the network while considering not only the top-1 prediction but also, e.g., the top-2 and top-5 predictions. We evaluate the proposed losses for fine-tuning on state-of-the-art architectures, as well as for training from scratch. We find that relaxing k not only produces better top-5 accuracies but also improves top-1 accuracy. When fine-tuning publicly available ImageNet models, we achieve a new state-of-the-art for these models.
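
To make the idea concrete, below is a minimal PyTorch sketch of a differentiable top-k cross-entropy loss. It uses the NeuralSort relaxation (Grover et al., 2019), one of several differentiable sorting/ranking operators compatible with this approach, to obtain a row-stochastic relaxed permutation matrix over class scores; the names neural_sort and topk_cross_entropy, as well as the example k-weights, are illustrative and not the authors' implementation.

import torch

def neural_sort(scores, tau=1.0):
    # NeuralSort relaxation (Grover et al., 2019): relaxed descending-sort
    # permutation matrices, used here as a stand-in for the differentiable
    # sorting/ranking operators the paper builds on.
    # scores: (batch, n) -> (batch, n, n); row i is a distribution over
    # which class occupies the i-th largest position.
    n = scores.size(-1)
    # Pairwise absolute differences |s_j - s_l| and their row sums (A_s 1).
    abs_diff = (scores.unsqueeze(-1) - scores.unsqueeze(-2)).abs()   # (b, n, n)
    row_sums = abs_diff.sum(dim=-1)                                  # (b, n)
    i = torch.arange(1, n + 1, device=scores.device, dtype=scores.dtype)
    coeff = (n + 1 - 2 * i).view(1, n, 1)                            # (1, n, 1)
    logits = coeff * scores.unsqueeze(1) - row_sums.unsqueeze(1)     # (b, n, n)
    return torch.softmax(logits / tau, dim=-1)

def topk_cross_entropy(class_scores, targets, ks=(1, 5),
                       weights=(0.5, 0.5), tau=1.0, eps=1e-9):
    # Differentiable top-k cross-entropy optimized for several k at once.
    # class_scores: (batch, num_classes) raw logits; targets: (batch,) labels.
    p_hat = neural_sort(class_scores, tau)                 # (b, n, n)
    b, n, _ = p_hat.shape
    # Probability that the true class occupies each sorted rank.
    idx = targets.view(b, 1, 1).expand(b, n, 1)
    p_true = p_hat.gather(2, idx).squeeze(-1)              # (b, n)
    # Cumulative sum over ranks ~ P(true class is within the top-k).
    p_in_topk = p_true.cumsum(dim=1)                       # (b, n)
    loss = class_scores.new_zeros(())
    for k, w in zip(ks, weights):
        loss = loss + w * (-torch.log(p_in_topk[:, k - 1].clamp_min(eps))).mean()
    return loss

# Usage sketch: weight the top-1 and top-5 objectives (weights are assumptions).
scores = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))
loss = topk_cross_entropy(scores, labels, ks=(1, 5), weights=(0.75, 0.25))
loss.backward()

Note that the relaxation above builds an n-by-n matrix per sample, which is costly for many classes (e.g., ImageNet's 1000); a common mitigation is to restrict the relaxation to a small set of top-scoring candidate classes, which this sketch omits for brevity.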

Cite

Text

Petersen et al. "Differentiable Top-K Classification Learning." International Conference on Machine Learning, 2022.

Markdown

[Petersen et al. "Differentiable Top-K Classification Learning." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/petersen2022icml-differentiable/)

BibTeX

@inproceedings{petersen2022icml-differentiable,
  title     = {{Differentiable Top-K Classification Learning}},
  author    = {Petersen, Felix and Kuehne, Hilde and Borgelt, Christian and Deussen, Oliver},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {17656--17668},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/petersen2022icml-differentiable/}
}