Improving SVM Text Classification Performance Through Threshold Adjustment

Abstract

In general, support vector machines (SVMs), when applied to text classification, provide excellent precision but poor recall. One means of customizing SVMs to improve recall is to adjust the threshold associated with an SVM. We describe an automatic process for adjusting the thresholds of generic SVMs that incorporates a user utility model, an integral part of an information management system. By using thresholds based on utility models and the ranking properties of classifiers, it is possible to overcome the precision bias of SVMs and ensure robust performance in recall across a wide variety of topics, even when training data are sparse. Evaluations on TREC data show that our proposed threshold-adjusting algorithm boosts the performance of baseline SVMs by at least 20% for standard information retrieval measures.
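
To make the idea of utility-driven threshold adjustment concrete, the following is a minimal sketch and not the authors' algorithm, whose details the abstract does not give. It assumes that SVM decision scores and relevance labels are available for a held-out set of documents, and that the user utility model is linear: a reward for each relevant document accepted and a penalty for each non-relevant one. The function name `best_threshold` and the weights `tp_gain` and `fp_cost` are hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple


def best_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    tp_gain: float = 2.0,   # assumed reward per relevant document accepted
    fp_cost: float = 1.0,   # assumed penalty per non-relevant document accepted
) -> Tuple[float, float]:
    """Return (threshold, utility) maximizing a linear utility on held-out data.

    A document is accepted when its score is >= the returned threshold.
    If accepting nothing is best, the threshold is +inf and the utility 0.
    """
    # Use the classifier as a ranker: highest decision score first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_thr = float("inf")
    best_utility = 0.0
    utility = 0.0

    for i, (score, label) in enumerate(ranked):
        # Accepting this document changes the running utility.
        utility += tp_gain if label else -fp_cost

        next_score = ranked[i + 1][0] if i + 1 < len(ranked) else None
        # A cut is only possible between two distinct score values.
        if next_score is not None and next_score == score:
            continue

        if utility > best_utility:
            best_utility = utility
            # Place the cut midway to the next score, or at the last score.
            best_thr = score if next_score is None else (score + next_score) / 2.0

    return best_thr, best_utility
```

With the default SVM threshold of 0, relevant documents that receive a negative score are rejected, which is the low-recall behavior described above. Choosing the cut point from the ranked list instead lets the utility weights decide how much precision to trade for recall.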

Cite

Text

Shanahan and Roma. "Improving SVM Text Classification Performance Through Threshold Adjustment." European Conference on Machine Learning, 2003. doi:10.1007/978-3-540-39857-8_33

Markdown

[Shanahan and Roma. "Improving SVM Text Classification Performance Through Threshold Adjustment." European Conference on Machine Learning, 2003.](https://mlanthology.org/ecmlpkdd/2003/shanahan2003ecml-improving/) doi:10.1007/978-3-540-39857-8_33

BibTeX

@inproceedings{shanahan2003ecml-improving,
  title     = {{Improving SVM Text Classification Performance Through Threshold Adjustment}},
  author    = {Shanahan, James G. and Roma, Norbert},
  booktitle = {European Conference on Machine Learning},
  year      = {2003},
  pages     = {361--372},
  doi       = {10.1007/978-3-540-39857-8_33},
  url       = {https://mlanthology.org/ecmlpkdd/2003/shanahan2003ecml-improving/}
}