Learning from Massive Noisy Labeled Data for Image Classification

Abstract

Large-scale supervised datasets are crucial for training convolutional neural networks (CNNs) on a wide range of computer vision problems. However, obtaining a massive amount of well-labeled data is usually expensive and time-consuming. In this paper, we introduce a general framework for training CNNs with only a limited number of clean labels and millions of easily obtained noisy labels. We model the relationships between images, class labels, and label noise with a probabilistic graphical model and integrate it into an end-to-end deep learning system. To demonstrate the effectiveness of our approach, we collect a large-scale real-world clothing classification dataset with both noisy and clean labels. Experiments on this dataset indicate that our approach corrects noisy labels more accurately and improves the performance of the trained CNNs.
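
To make the idea of relating class labels and label noise probabilistically more concrete, here is a minimal sketch of a generic class-conditional noise model. It is not the graphical model from the paper, whose structure the abstract does not specify. The function names, the transition matrix, and the toy numbers are all assumptions made for illustration only.

```python
def posterior_true_label(class_probs, transition, noisy_label):
    """Return p(y | x, noisy_label) under a class-conditional noise model.

    class_probs[y]   : classifier estimate of p(y | x) (hypothetical input)
    transition[y][k] : assumed p(noisy label = k | true label = y)
    """
    # Bayes' rule: p(y | x, noisy) is proportional to p(y | x) * p(noisy | y).
    unnorm = [p * transition[y][noisy_label] for y, p in enumerate(class_probs)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("noisy label has zero probability under the model")
    return [u / total for u in unnorm]


def noisy_label_likelihood(class_probs, transition, noisy_label):
    """Return p(noisy_label | x) by marginalising over the true label."""
    return sum(p * transition[y][noisy_label] for y, p in enumerate(class_probs))


# Toy example with three hypothetical classes (all values are made up).
class_probs = [0.7, 0.2, 0.1]      # classifier belief p(y | x)
transition = [                     # rows: true label, columns: noisy label
    [0.8, 0.1, 0.1],
    [0.3, 0.6, 0.1],
    [0.1, 0.1, 0.8],
]

posterior = posterior_true_label(class_probs, transition, noisy_label=1)
likelihood = noisy_label_likelihood(class_probs, transition, noisy_label=1)
```

In this sketch the posterior combines the classifier's belief with the observed noisy label, which is one simple way a noisy label could be "corrected"; the likelihood is the kind of quantity that could serve as a training objective when only noisy labels are available.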

Cite

Text

Xiao et al. "Learning from Massive Noisy Labeled Data for Image Classification." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7298885

Markdown

[Xiao et al. "Learning from Massive Noisy Labeled Data for Image Classification." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/xiao2015cvpr-learning/) doi:10.1109/CVPR.2015.7298885

BibTeX

@inproceedings{xiao2015cvpr-learning,
  title     = {{Learning from Massive Noisy Labeled Data for Image Classification}},
  author    = {Xiao, Tong and Xia, Tian and Yang, Yi and Huang, Chang and Wang, Xiaogang},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2015},
  doi       = {10.1109/CVPR.2015.7298885},
  url       = {https://mlanthology.org/cvpr/2015/xiao2015cvpr-learning/}
}