NLP-Driven IR: Evaluating Performances over a Text Classification Task
Abstract
Although several attempts have been made to introduce Natural Language Processing (NLP) techniques into Information Retrieval (IR), most have failed to prove their effectiveness in improving performance. In this paper, Text Classification (TC) is taken as the IR task, and the effect of the linguistic capabilities of the underlying system is studied. A novel model for TC, which extends a well-known statistical model (i.e. Rocchio’s formula [Ittner et al., 1995]) and is applied to linguistic features, has been defined and tested experimentally. The proposed model represents an effective feature selection methodology. All the experiments show a significant improvement over other purely statistical methods (e.g. [Yang, 1999]), thus stressing the relevance of the available linguistic information. Moreover, the derived classifier reaches the performance (about 85%) of the best known models (i.e. Support Vector Machines (SVM) and k-Nearest Neighbour (KNN)), which are characterized by a higher computational complexity for training and processing.
Cite
Text
Basili et al. "NLP-Driven IR: Evaluating Performances over a Text Classification Task." International Joint Conference on Artificial Intelligence, 2001.
Markdown
[Basili et al. "NLP-Driven IR: Evaluating Performances over a Text Classification Task." International Joint Conference on Artificial Intelligence, 2001.](https://mlanthology.org/ijcai/2001/basili2001ijcai-nlp/)
BibTeX
@inproceedings{basili2001ijcai-nlp,
title = {{NLP-Driven IR: Evaluating Performances over a Text Classification Task}},
author = {Basili, Roberto and Moschitti, Alessandro and Pazienza, Maria Teresa},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2001},
pages = {1286-1294},
url = {https://mlanthology.org/ijcai/2001/basili2001ijcai-nlp/}
}