Investigating the Effect of Novel Classes in Semi-Supervised Learning
Abstract
Semi-supervised learning usually assumes that the distribution of the unlabelled data is the same as that of the labelled data. This assumption does not always hold in practice. We empirically show that unlabelled data containing novel examples and classes from outside the distribution of the labelled data can degrade the performance of semi-supervised learning algorithms. We propose a 1-nearest-neighbour based method that assigns a weight to each unlabelled example in order to reduce the negative effect of novel classes in the unlabelled data. Experimental results on the MNIST, Fashion-MNIST and CIFAR-10 datasets suggest that the negative effect of novel classes becomes statistically insignificant when the proposed method is applied. With the proposed technique, models trained on unlabelled data containing novel classes achieve performance similar to that of models trained on clean unlabelled data.
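The abstract does not spell out the exact weighting function, but the general idea of down-weighting unlabelled examples that lie far from the labelled data can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the function name `one_nn_weights`, the exponential kernel, and the `scale` parameter are assumptions introduced for the example.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def one_nn_weights(x_labelled, x_unlabelled, scale=1.0):
    """Weight each unlabelled example by its 1-NN distance to the labelled set.

    Unlabelled examples far from every labelled example (plausibly from novel
    classes) receive weights near 0; examples close to the labelled distribution
    receive weights near 1. The exponential kernel and `scale` are illustrative
    choices, not the weighting function used in the paper.
    """
    nn = NearestNeighbors(n_neighbors=1).fit(x_labelled)
    dist, _ = nn.kneighbors(x_unlabelled)   # distance to nearest labelled example
    return np.exp(-dist.ravel() / scale)    # map distance to a weight in (0, 1]


# Toy usage: labelled data clustered near the origin, one unlabelled outlier.
x_lab = np.random.randn(100, 2)
x_unl = np.vstack([np.random.randn(5, 2), [[10.0, 10.0]]])
print(one_nn_weights(x_lab, x_unl))  # the outlier gets a much smaller weight
```

In a semi-supervised pipeline, such weights would scale each unlabelled example's contribution to the unsupervised loss, so that likely out-of-distribution examples have little influence on training.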
Cite
Text
Peng et al. "Investigating the Effect of Novel Classes in Semi-Supervised Learning." Proceedings of The Eleventh Asian Conference on Machine Learning, 2019.
Markdown
[Peng et al. "Investigating the Effect of Novel Classes in Semi-Supervised Learning." Proceedings of The Eleventh Asian Conference on Machine Learning, 2019.](https://mlanthology.org/acml/2019/peng2019acml-investigating/)
BibTeX
@inproceedings{peng2019acml-investigating,
title = {{Investigating the Effect of Novel Classes in Semi-Supervised Learning}},
author = {Peng, Alex Yuxuan and Koh, Yun Sing and Riddle, Patricia and Pfahringer, Bernhard},
booktitle = {Proceedings of The Eleventh Asian Conference on Machine Learning},
year = {2019},
pages = {615--630},
volume = {101},
url = {https://mlanthology.org/acml/2019/peng2019acml-investigating/}
}