Improving SVM Text Classification Performance Through Threshold Adjustment
Abstract
In general, support vector machines (SVMs), when applied to text classification, provide excellent precision but poor recall. One means of customizing SVMs to improve recall is to adjust the threshold associated with an SVM. We describe an automatic process for adjusting the thresholds of generic SVMs that incorporates a user utility model, an integral part of an information management system. By using thresholds based on utility models and the ranking properties of classifiers, it is possible to overcome the precision bias of SVMs and ensure robust recall performance across a wide variety of topics, even when training data are sparse. Evaluations on TREC data show that our proposed threshold-adjusting algorithm boosts the performance of baseline SVMs by at least 20% for standard information retrieval measures.
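The abstract does not spell out the adjustment procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes held-out documents have already been scored by an SVM and labeled, ranks them by score, and picks the cutoff that maximizes a linear utility. The function name `best_utility_threshold`, the weights `tp_gain` and `fp_cost`, and the sample scores are all hypothetical.

```python
from typing import Sequence, Tuple


def best_utility_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    tp_gain: float = 2.0,   # assumed reward per relevant document accepted
    fp_cost: float = 1.0,   # assumed penalty per non-relevant document accepted
) -> Tuple[float, float]:
    """Return (threshold, utility) maximizing tp_gain*TP - fp_cost*FP.

    A document is accepted when its score is >= threshold. If no cutoff
    yields positive utility, the threshold is +inf (accept nothing).
    """
    # Rank documents from highest to lowest classifier score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_threshold = float("inf")
    best_utility = 0.0
    utility = 0.0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Documents with tied scores are accepted or rejected together.
        while i < len(ranked) and ranked[i][0] == score:
            utility += tp_gain if ranked[i][1] else -fp_cost
            i += 1
        if utility > best_utility:
            best_utility = utility
            best_threshold = score
    return best_threshold, best_utility


# Hypothetical SVM decision values and relevance labels (1 = relevant).
scores = [1.2, 0.4, -0.1, -0.3, -0.8, -1.5]
labels = [1, 0, 1, 1, 0, 0]
threshold, utility = best_utility_threshold(scores, labels)
print(threshold, utility)  # -0.3 5.0
```

In this toy data the default threshold of 0 accepts only one of the three relevant documents, while the utility-based cutoff of -0.3 accepts all three at the cost of one false positive, which illustrates the trade of a little precision for recall that the abstract describes.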
Cite
Text
Shanahan and Roma. "Improving SVM Text Classification Performance Through Threshold Adjustment." European Conference on Machine Learning, 2003. doi:10.1007/978-3-540-39857-8_33
Markdown
[Shanahan and Roma. "Improving SVM Text Classification Performance Through Threshold Adjustment." European Conference on Machine Learning, 2003.](https://mlanthology.org/ecmlpkdd/2003/shanahan2003ecml-improving/) doi:10.1007/978-3-540-39857-8_33
BibTeX
@inproceedings{shanahan2003ecml-improving,
title = {{Improving SVM Text Classification Performance Through Threshold Adjustment}},
author = {Shanahan, James G. and Roma, Norbert},
booktitle = {European Conference on Machine Learning},
year = {2003},
pages = {361-372},
doi = {10.1007/978-3-540-39857-8_33},
url = {https://mlanthology.org/ecmlpkdd/2003/shanahan2003ecml-improving/}
}