Supervised Learning and Co-Training
Abstract
Co-training under the Conditional Independence Assumption is among the models that demonstrate how radically the need for labeled data can be reduced if a huge amount of unlabeled data is available. In this paper, we explore how much credit for this saving must be assigned solely to the extra assumptions underlying the Co-training model. To this end, we compute general (almost tight) upper and lower bounds on the sample size needed to achieve the success criterion of PAC-learning within the model of Co-training under the Conditional Independence Assumption in a purely supervised setting. The upper bounds lie significantly below the lower bounds for PAC-learning without Co-training. Thus, Co-training saves labeled data even when not combined with unlabeled data. On the other hand, the saving is much less radical than the known savings in the semi-supervised setting.
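For context, the baseline against which the abstract's comparison is made is the classical sample complexity of realizable PAC-learning. The following is a sketch of those standard textbook bounds, not a result from this paper; the notation (VC dimension d, accuracy ε, confidence 1−δ) is assumed here for illustration:

% Classical sample complexity of realizable PAC-learning a concept
% class of VC dimension d to error \varepsilon with confidence 1-\delta.
% Standard textbook bounds, shown only as the baseline that the
% abstract's "lower bounds for PAC-learning without Co-training"
% refers to; not taken from the paper itself.
\begin{align*}
  m(\varepsilon,\delta) &= O\!\left(\frac{d\log(1/\varepsilon) + \log(1/\delta)}{\varepsilon}\right)
    && \text{(labeled examples suffice)}\\
  m(\varepsilon,\delta) &= \Omega\!\left(\frac{d + \log(1/\delta)}{\varepsilon}\right)
    && \text{(labeled examples are necessary)}
\end{align*}

The paper's claim, in these terms, is that its upper bounds for supervised Co-training fall strictly below the Ω-lower bound above, so the assumption itself (not the unlabeled data) accounts for part of the saving.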
Cite
Text
Darnstädt et al. "Supervised Learning and Co-Training." International Conference on Algorithmic Learning Theory, 2011. doi:10.1007/978-3-642-24412-4_33
Markdown
[Darnstädt et al. "Supervised Learning and Co-Training." International Conference on Algorithmic Learning Theory, 2011.](https://mlanthology.org/alt/2011/darnstadt2011alt-supervised/) doi:10.1007/978-3-642-24412-4_33
BibTeX
@inproceedings{darnstadt2011alt-supervised,
title = {{Supervised Learning and Co-Training}},
author = {Darnstädt, Malte and Simon, Hans Ulrich and Szörényi, Balázs},
booktitle = {International Conference on Algorithmic Learning Theory},
year = {2011},
pages = {425--439},
doi = {10.1007/978-3-642-24412-4_33},
url = {https://mlanthology.org/alt/2011/darnstadt2011alt-supervised/}
}