Heterogeneous Uncertainty Sampling for Supervised Learning
Abstract
Uncertainty sampling methods iteratively request class labels for training instances whose classes are uncertain despite the previously labeled instances. These methods can greatly reduce the number of instances that an expert need label. One problem with this approach is that the classifier best suited for an application may be too expensive to train or use during the selection of instances. We test the use of one classifier (a highly efficient probabilistic one) to select examples for training another (the C4.5 rule induction program). Despite being chosen by this heterogeneous approach, the uncertainty samples yielded classifiers with lower error rates than random samples ten times larger.
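The selection loop the abstract describes can be sketched in a few lines: a cheap probabilistic scorer ranks the unlabeled pool by how close each predicted class probability is to 0.5, and the most uncertain examples are sent for labeling (and, in the paper's heterogeneous setup, ultimately to a more expensive learner such as C4.5). The toy logistic scorer and fixed weights below are illustrative assumptions, not the classifier Lewis and Catlett used.

```python
import math

def predict_proba(x, weights, bias=0.0):
    """Probability of the positive class under a toy logistic scorer."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def uncertainty_sample(pool, weights, batch_size):
    """Return indices of the batch_size pool items whose predicted
    probability is closest to 0.5 -- the ones the cheap scorer is
    least certain about, which an expert would then label."""
    scored = [(abs(predict_proba(x, weights) - 0.5), i)
              for i, x in enumerate(pool)]
    scored.sort()
    return [i for _, i in scored[:batch_size]]

# Hypothetical unlabeled pool of 2-feature instances and fixed weights.
pool = [(3.0, 1.0), (0.1, -0.1), (-2.0, 0.5), (0.0, 0.05)]
weights = (1.0, 1.0)
picked = uncertainty_sample(pool, weights, 2)  # → [1, 3]
```

In practice the selected examples are labeled and added to the training set, and the loop repeats, which is how the paper obtains small samples that rival random samples ten times larger.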
Cite
Text
Lewis and Catlett. "Heterogeneous Uncertainty Sampling for Supervised Learning." International Conference on Machine Learning, 1994. doi:10.1016/B978-1-55860-335-6.50026-X
Markdown
[Lewis and Catlett. "Heterogeneous Uncertainty Sampling for Supervised Learning." International Conference on Machine Learning, 1994.](https://mlanthology.org/icml/1994/lewis1994icml-heterogeneous/) doi:10.1016/B978-1-55860-335-6.50026-X
BibTeX
@inproceedings{lewis1994icml-heterogeneous,
title = {{Heterogeneous Uncertainty Sampling for Supervised Learning}},
author = {Lewis, David D. and Catlett, Jason},
booktitle = {International Conference on Machine Learning},
year = {1994},
pages = {148-156},
doi = {10.1016/B978-1-55860-335-6.50026-X},
url = {https://mlanthology.org/icml/1994/lewis1994icml-heterogeneous/}
}