Enhancing Supervised Learning with Unlabeled Data

Abstract

In many practical learning scenarios, there is a small amount of labeled data along with a large pool of unlabeled data. Many supervised learning algorithms have been developed and extensively studied. We present a new "co-training" strategy for using unlabeled data to improve the performance of standard supervised learning algorithms. Unlike much of the prior work, such as the co-training procedure of Blum and Mitchell (1998), we do not assume there are two redundant views both of which are sufficient for perfect classification. The only requirement our co-training strategy places on each supervised learning algorithm is that its hypothesis partitions the example space into a set of equivalence classes (e.g., for a decision tree, each leaf defines an equivalence class). We evaluate our co-training strategy via experiments using data from the UCI repository.
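
The equivalence-class requirement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy tree, its thresholds, and the names `leaf_id` and `equivalence_classes` are all hypothetical, chosen only to show how a decision tree's leaves partition a set of examples.

```python
from collections import defaultdict

# Hypothetical hand-written decision tree over two numeric features.
# The thresholds are made up for illustration; a real tree would be learned.
def leaf_id(x):
    """Return the id of the leaf that example x = (x0, x1) reaches."""
    if x[0] <= 0.5:
        return "A" if x[1] <= 0.3 else "B"
    return "C"

def equivalence_classes(examples):
    """Group examples by the leaf they reach.

    Every example reaches exactly one leaf, so the groups form a
    partition of the input: each group is one equivalence class.
    """
    classes = defaultdict(list)
    for x in examples:
        classes[leaf_id(x)].append(x)
    return dict(classes)

# Assumed toy pool of unlabeled examples.
unlabeled = [(0.2, 0.1), (0.4, 0.9), (0.9, 0.5), (0.1, 0.2)]
classes = equivalence_classes(unlabeled)
```

Here two examples fall into the same class exactly when the tree routes them to the same leaf, which is the only structural property the abstract says the strategy needs from each learner.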

Cite

Text

Goldman and Zhou. "Enhancing Supervised Learning with Unlabeled Data." International Conference on Machine Learning, 2000.

Markdown

[Goldman and Zhou. "Enhancing Supervised Learning with Unlabeled Data." International Conference on Machine Learning, 2000.](https://mlanthology.org/icml/2000/goldman2000icml-enhancing/)

BibTeX

@inproceedings{goldman2000icml-enhancing,
  title     = {{Enhancing Supervised Learning with Unlabeled Data}},
  author    = {Goldman, Sally A. and Zhou, Yan},
  booktitle = {International Conference on Machine Learning},
  year      = {2000},
  pages     = {327--334},
  url       = {https://mlanthology.org/icml/2000/goldman2000icml-enhancing/}
}