Partially Supervised Classification of Text Documents
Abstract
We investigate the following problem: Given a set of documents of a particular topic or class P, and a large set M of mixed documents that contains documents from class P and other types of documents, identify the documents from class P in M. The key feature of this problem is that there is no labeled non-P document, which makes traditional machine learning techniques inapplicable, as they all need labeled documents of both classes. We call this problem partially supervised classification. In this paper, we show that this problem can be posed as a constrained optimization problem and that under appropriate conditions, solutions to the constrained optimization problem will give good solutions to the partially supervised classification problem. We present a novel technique to solve the problem and demonstrate the effectiveness of the technique through extensive experimentation.
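The abstract does not spell out the constrained optimization problem, so the following is only an illustrative sketch under stated assumptions, not the authors' technique. It assumes each document already has a one-dimensional relevance score, and it reads the constraint as "keep at least a fixed fraction of the known positive documents" while minimizing the number of mixed-set documents labeled positive. The function name `pick_threshold`, the `min_recall` parameter, and the scores are all hypothetical.

```python
from typing import Sequence, Tuple


def pick_threshold(
    pos_scores: Sequence[float],
    mixed_scores: Sequence[float],
    min_recall: float = 0.9,
) -> Tuple[float, int]:
    """Choose a score threshold t (a document is labeled positive if score >= t).

    Assumed objective: minimize the number of mixed-set documents labeled
    positive, subject to labeling at least `min_recall` of the known
    positive documents as positive.
    """
    best = None
    # Only thresholds equal to an observed score can change the labeling.
    for t in sorted(set(pos_scores) | set(mixed_scores)):
        recall = sum(s >= t for s in pos_scores) / len(pos_scores)
        if recall < min_recall:
            continue  # constraint violated: too many known positives lost
        n_mixed_positive = sum(s >= t for s in mixed_scores)
        if best is None or n_mixed_positive < best[1]:
            best = (t, n_mixed_positive)
    return best


# Hypothetical scores: four known positives and seven mixed documents.
pos_scores = [0.9, 0.8, 0.75, 0.6]
mixed_scores = [0.95, 0.85, 0.7, 0.4, 0.3, 0.2, 0.1]

threshold, n_flagged = pick_threshold(pos_scores, mixed_scores, min_recall=0.75)
print(threshold, n_flagged)
```

No labeled negative document is used anywhere in the sketch: the mixed set enters only through the count being minimized, and the positive set enters only through the constraint.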
Cite
Text
Liu et al. "Partially Supervised Classification of Text Documents." International Conference on Machine Learning, 2002.
Markdown
[Liu et al. "Partially Supervised Classification of Text Documents." International Conference on Machine Learning, 2002.](https://mlanthology.org/icml/2002/liu2002icml-partially/)
BibTeX
@inproceedings{liu2002icml-partially,
title = {{Partially Supervised Classification of Text Documents}},
author = {Liu, Bing and Lee, Wee Sun and Yu, Philip S. and Li, Xiaoli},
booktitle = {International Conference on Machine Learning},
year = {2002},
pages = {387-394},
url = {https://mlanthology.org/icml/2002/liu2002icml-partially/}
}