A Closer Look at Novel Class Discovery from the Labeled Set

Abstract

Novel class discovery (NCD) aims to infer novel categories in an unlabeled set using prior knowledge from a labeled set comprising diverse but related classes. Existing research focuses on how to use the labeled set methodologically and pays little attention to analyzing it. In this study, we take a closer look at NCD from the perspective of the labeled set and focus on two questions: (i) Given an unlabeled set, what labeled set best supports novel class discovery? (ii) A fundamental premise of NCD is that the labeled set must be related to the unlabeled set, but how can we measure this relation? For (i), we propose and substantiate the hypothesis that NCD could benefit from a labeled set with high semantic similarity to the unlabeled set. Using ImageNet's hierarchical class structure, we create a large-scale benchmark with varying semantic similarity between the labeled and unlabeled sets. In contrast, existing NCD benchmarks ignore this semantic relation. For (ii), we introduce a mathematical definition for quantifying the semantic similarity between labeled and unlabeled sets. We use this metric to validate our benchmark and show that it corresponds closely with NCD performance. Furthermore, previous works commonly assume, without quantitative analysis, that label information is always beneficial. Counterintuitively, however, our experimental results show that using labels may lead to sub-optimal outcomes in low-similarity settings.

Cite

Text

Li et al. "A Closer Look at Novel Class Discovery from the Labeled Set." NeurIPS 2022 Workshops: DistShift, 2022.

Markdown

[Li et al. "A Closer Look at Novel Class Discovery from the Labeled Set." NeurIPS 2022 Workshops: DistShift, 2022.](https://mlanthology.org/neuripsw/2022/li2022neuripsw-closer/)

BibTeX

@inproceedings{li2022neuripsw-closer,
  title     = {{A Closer Look at Novel Class Discovery from the Labeled Set}},
  author    = {Li, Ziyun and Otholt, Jona and Dai, Ben and Hu, Di and Meinel, Christoph and Yang, Haojin},
  booktitle = {NeurIPS 2022 Workshops: DistShift},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/li2022neuripsw-closer/}
}