Hierarchical Classification at Multiple Operating Points

Abstract

Many classification problems consider classes that form a hierarchy. Classifiers that are aware of this hierarchy may be able to make confident predictions at a coarse level despite being uncertain at the fine-grained level. While it is generally possible to vary the granularity of predictions using a threshold at inference time, most contemporary work considers only leaf-node prediction, and almost no prior work has compared methods at multiple operating points. We present an efficient algorithm to produce operating characteristic curves for any method that assigns a score to every class in the hierarchy. Applying this technique to evaluate existing methods reveals that top-down classifiers are dominated by a naive flat softmax classifier across the entire operating range. We further propose two novel loss functions and show that a soft variant of the structured hinge loss is able to significantly outperform the flat baseline. Finally, we investigate the poor accuracy of top-down classifiers and demonstrate that they perform relatively well on unseen classes.
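As a concrete illustration of the inference scheme the abstract alludes to (varying the granularity of predictions with a threshold at inference time), the sketch below aggregates flat softmax probabilities over the leaves of a toy hierarchy and predicts the deepest node whose score clears a confidence threshold; sweeping that threshold traces out an operating characteristic curve. This is a minimal sketch, not the paper's efficient algorithm, and the names node_scores, predict_at_threshold, parent and leaf_probs are illustrative assumptions rather than identifiers from the paper.

def node_scores(leaf_probs, parent):
    """Score each node as the sum of softmax probabilities of its descendant leaves.

    leaf_probs: dict mapping leaf id -> softmax probability
    parent:     dict mapping node id -> parent id (the root maps to None)
    """
    scores = {}
    for leaf, p in leaf_probs.items():
        node = leaf
        while node is not None:  # propagate probability mass to every ancestor
            scores[node] = scores.get(node, 0.0) + p
            node = parent[node]
    return scores

def depth_of(node, parent):
    """Number of edges from the node up to the root."""
    d = 0
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d

def predict_at_threshold(scores, parent, threshold):
    """Predict the deepest (most specific) node whose score meets the threshold."""
    confident = [n for n, s in scores.items() if s >= threshold]
    if not confident:
        return None
    return max(confident, key=lambda n: depth_of(n, parent))

# Toy hierarchy: root -> {animal, vehicle}, animal -> {cat, dog}, vehicle -> {car}.
parent = {"root": None, "animal": "root", "vehicle": "root",
          "cat": "animal", "dog": "animal", "car": "vehicle"}
leaf_probs = {"cat": 0.45, "dog": 0.35, "car": 0.20}
scores = node_scores(leaf_probs, parent)
print(predict_at_threshold(scores, parent, 0.4))  # "cat"    (fine-grained, less confident)
print(predict_at_threshold(scores, parent, 0.5))  # "animal" (coarser but more confident)
print(predict_at_threshold(scores, parent, 0.9))  # "root"

Raising the threshold trades specificity for confidence, which is the trade-off that an operating characteristic curve summarises across all thresholds.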

Cite

Text

Jack Valmadre. "Hierarchical Classification at Multiple Operating Points." Neural Information Processing Systems, 2022.

Markdown

[Jack Valmadre. "Hierarchical Classification at Multiple Operating Points." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/valmadre2022neurips-hierarchical/)

BibTeX

@inproceedings{valmadre2022neurips-hierarchical,
  title     = {{Hierarchical Classification at Multiple Operating Points}},
  author    = {Valmadre, Jack},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/valmadre2022neurips-hierarchical/}
}