Which Is More Effective in Label Noise Cleaning, Correction or Filtering?
Abstract
Most noise cleaning methods adopt either the correction mode or the filtering mode to build robust models. However, the effectiveness, applicability, and hyper-parameter insensitivity of the two modes have not been carefully studied. We compare the two cleaning modes via a rebuilt error bound in noisy environments. At the dataset level, Theorem 5 implies that correction is more effective than filtering when the cleaned datasets have close noise rates. At the sample level, Theorem 6 indicates that confidently noisy labels (large noise probabilities) are better suited to correction, while unconfidently noisy labels (medium noise probabilities) should be filtered. In addition, an imperfect hyper-parameter may harm filtering less than it harms correction. Unlike existing methods with a single cleaning mode, the proposed Fusion cleaning framework of Correction and Filtering (FCF) combines the advantages of both modes to deal with diverse suspicious labels. Experimental results demonstrate that our FCF method achieves state-of-the-art performance on benchmark datasets.
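The sample-level rule in the abstract (correct labels with large noise probabilities, filter those with medium ones) can be illustrated with a minimal sketch. This is not the paper's FCF algorithm: the function name `fuse_clean`, the two thresholds, and the assumption that per-sample noise probabilities and predicted labels are already available from some estimator are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def fuse_clean(
    labels: Sequence[int],
    noise_probs: Sequence[float],
    predicted: Sequence[int],
    filter_threshold: float = 0.5,   # hypothetical value, not from the paper
    correct_threshold: float = 0.9,  # hypothetical value, not from the paper
) -> List[Tuple[int, int]]:
    """Return (sample index, cleaned label) pairs for the samples that are kept.

    Large noise probability  -> correct: replace the label with the prediction.
    Medium noise probability -> filter: drop the sample.
    Small noise probability  -> keep the given label unchanged.
    """
    if not 0.0 <= filter_threshold <= correct_threshold <= 1.0:
        raise ValueError("expected 0 <= filter_threshold <= correct_threshold <= 1")
    if not (len(labels) == len(noise_probs) == len(predicted)):
        raise ValueError("labels, noise_probs, and predicted must have equal length")

    cleaned: List[Tuple[int, int]] = []
    for index, (label, prob, pred) in enumerate(zip(labels, noise_probs, predicted)):
        if prob >= correct_threshold:
            cleaned.append((index, pred))   # confident noise: correct
        elif prob >= filter_threshold:
            continue                        # unconfident noise: filter
        else:
            cleaned.append((index, label))  # likely clean: keep as given
    return cleaned


if __name__ == "__main__":
    kept = fuse_clean(
        labels=[0, 1, 1, 0],
        noise_probs=[0.10, 0.95, 0.60, 0.30],
        predicted=[0, 0, 0, 1],
    )
    print(kept)
```

In this toy call, sample 1 is relabeled, sample 2 is dropped, and samples 0 and 3 keep their given labels. How the noise probabilities are estimated and how the thresholds are chosen are left open here; the paper's own procedure should be consulted for those details.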
Cite
Text
Jiang et al. "Which Is More Effective in Label Noise Cleaning, Correction or Filtering?." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I11.29183
Markdown
[Jiang et al. "Which Is More Effective in Label Noise Cleaning, Correction or Filtering?." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/jiang2024aaai-more/) doi:10.1609/AAAI.V38I11.29183
BibTeX
@inproceedings{jiang2024aaai-more,
title = {{Which Is More Effective in Label Noise Cleaning, Correction or Filtering?}},
author = {Jiang, Gaoxia and Zhang, Jia and Bai, Xuefei and Wang, Wenjian and Meng, Deyu},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
pages = {12866--12873},
doi = {10.1609/AAAI.V38I11.29183},
url = {https://mlanthology.org/aaai/2024/jiang2024aaai-more/}
}