Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data

Abstract

Diagnosing and cleaning data is a crucial step in building robust machine learning systems. However, identifying problems within large-scale datasets with real-world distributions is challenging due to complex issues such as label errors, under-representation, and outliers. In this paper, we propose a unified approach for identifying problematic data by utilizing a largely ignored source of information: the relational structure of data in the feature-embedded space. To this end, we present scalable and effective algorithms for detecting label errors and outlier data based on the relational graph structure of the data. We further introduce a visualization tool that provides contextual information about a data point in the feature-embedded space, serving as an effective tool for interactively diagnosing data. We evaluate the label error and outlier/out-of-distribution (OOD) detection performance of our approach on large-scale image, speech, and language tasks, including ImageNet, ESC-50, and SST2. Our approach achieves state-of-the-art detection performance on all tasks considered and demonstrates its effectiveness in debugging large-scale real-world datasets across various domains. We release the code at https://github.com/snu-mllab/Neural-Relation-Graph.
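
The abstract does not spell out the algorithms, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a relation graph can be approximated by clipped cosine similarity between feature embeddings. The names `similarity_graph`, `label_noise_scores`, and `outlier_scores` and the toy data are hypothetical and do not come from the released code. A point whose similar neighbors mostly carry a different label gets a high label-noise score, and a point with little similarity to any other point gets a high outlier score.

```python
import numpy as np


def similarity_graph(features):
    """Dense graph of non-negative cosine similarities (an assumed stand-in
    for a relation graph; the paper's actual relation function may differ)."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)        # ignore self-edges
    return np.clip(sim, 0.0, None)    # keep only positive similarity as edge weight


def label_noise_scores(features, labels):
    """Hypothetical score: similarity to differently labeled points counts
    toward the score, similarity to same-labeled points counts against it."""
    sim = similarity_graph(features)
    same_label = labels[:, None] == labels[None, :]
    return np.where(same_label, -sim, sim).sum(axis=1)


def outlier_scores(features):
    """Hypothetical score: weakly connected points score higher."""
    return -similarity_graph(features).sum(axis=1)


# Toy embeddings (made up for illustration): two clusters, where the last
# point sits in the class-0 cluster but carries label 1.
features = np.array([
    [1.0, 0.1], [1.0, 0.0], [0.9, 0.1],   # class-0 cluster
    [0.1, 1.0], [0.0, 1.0], [0.1, 0.9],   # class-1 cluster
    [1.0, 0.05],                          # looks like class 0, labeled 1
])
labels = np.array([0, 0, 0, 1, 1, 1, 1])

noise = label_noise_scores(features, labels)
suspect = int(np.argmax(noise))           # index of the most suspicious label
```

The dense similarity matrix costs memory quadratic in the number of points, so this sketch is only suitable for small samples; a dataset at ImageNet scale would need a blockwise or approximate computation.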

Cite

Text

Kim et al. "Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data." Neural Information Processing Systems, 2023.

Markdown

[Kim et al. "Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/kim2023neurips-neural/)

BibTeX

@inproceedings{kim2023neurips-neural,
  title     = {{Neural Relation Graph: A Unified Framework for Identifying Label Noise and Outlier Data}},
  author    = {Kim, Jang-Hyun and Yun, Sangdoo and Song, Hyun Oh},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/kim2023neurips-neural/}
}