Small-Vote Sample Selection for Label-Noise Learning
Abstract
The small-loss criterion is widely used in recent label-noise learning methods. However, such a criterion only considers the loss of each training sample in a mini-batch but ignores the loss distribution in the whole training set. Moreover, the selection of clean samples depends on a heuristic clean data rate. As a result, some noisy-labeled samples are easily identified as clean ones, and vice versa. In this paper, we propose a novel yet simple sample selection method, which mainly consists of a Hierarchical Voting Scheme (HVS) and an Adaptive Clean data rate Estimation Strategy (ACES), to accurately identify clean samples and noisy-labeled samples for robust learning. Specifically, we propose HVS to effectively combine the global vote and the local vote, so that both epoch-level and batch-level information is exploited to assign a hierarchical vote for each mini-batch sample. Based on HVS, we further develop ACES to adaptively estimate the clean data rate by leveraging a 1D Gaussian Mixture Model (GMM). Experimental results show that our proposed method consistently outperforms several state-of-the-art label-noise learning methods on both synthetic and real-world noisy benchmark datasets.
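The abstract states only that ACES estimates the clean data rate with a 1D Gaussian Mixture Model; it does not say which per-sample quantity the mixture is fitted to or how the rate is read off. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each training sample has a scalar score (for example its loss or vote) where lower means "more likely clean", and that the clean data rate is taken as the mixing weight of the lower-mean component. The function names `fit_gmm_1d` and `estimate_clean_rate` are hypothetical.

```python
import math


def fit_gmm_1d(values, n_iter=100):
    """Fit a two-component 1D Gaussian mixture with plain EM.

    Returns (weights, means, variances), ordered so that component 0
    has the lower mean. Illustrative only; not the paper's code.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]  # initialise the two means at the extremes
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each value
        resp = []
        for v in values:
            dens = [
                weights[k]
                * math.exp(-((v - means[k]) ** 2) / (2.0 * variances[k]))
                / math.sqrt(2.0 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])

        # M-step: re-estimate weight, mean and variance per component
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-12
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid collapse

    # Order components so that index 0 is the low-score one
    if means[0] > means[1]:
        weights.reverse()
        means.reverse()
        variances.reverse()
    return weights, means, variances


def estimate_clean_rate(scores):
    """Assumed reading: clean rate = weight of the low-score component."""
    weights, _, _ = fit_gmm_1d(scores)
    return weights[0]
```

Under these assumptions, the estimated rate would replace a hand-tuned clean data rate: after each epoch, one would pass the per-sample scores to `estimate_clean_rate` and keep that fraction of the lowest-scoring samples as clean.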
Cite
Text
Xu et al. "Small-Vote Sample Selection for Label-Noise Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021. doi:10.1007/978-3-030-86523-8_44
Markdown
[Xu et al. "Small-Vote Sample Selection for Label-Noise Learning." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021.](https://mlanthology.org/ecmlpkdd/2021/xu2021ecmlpkdd-smallvote/) doi:10.1007/978-3-030-86523-8_44
BibTeX
@inproceedings{xu2021ecmlpkdd-smallvote,
title = {{Small-Vote Sample Selection for Label-Noise Learning}},
author = {Xu, Youze and Yan, Yan and Xue, Jing-Hao and Lu, Yang and Wang, Hanzi},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2021},
pages = {729--744},
doi = {10.1007/978-3-030-86523-8_44},
url = {https://mlanthology.org/ecmlpkdd/2021/xu2021ecmlpkdd-smallvote/}
}