Part-of-Speech Tagging Using Decision Trees

Màrquez, Lluís; Rodríguez, Horacio

doi:10.1007/BFB0026668

ECML-PKDD 1998 pp. 25-36

Abstract

We have applied inductive learning of statistical decision trees to the Natural Language Processing (NLP) task of morphosyntactic disambiguation (part-of-speech tagging). Previous work showed that the acquired language models are independent enough to be incorporated easily, as a statistical core of rules, into any flexible tagger. They are also complete enough to be used directly as sets of POS disambiguation rules. We have implemented a simple and fast tagger and evaluated it on the Wall Street Journal (WSJ) corpus, where it achieves remarkable accuracy. In this paper we address the problem of tagging when only a small amount of training material is available, which is crucial in any process of constructing an annotated corpus from scratch. We show that our system achieves quite high accuracy in this situation. We also address the problem of handling unknown words under the same scarcity of training examples. For this case we report comparative results and comments on closely related work.
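
To make the idea of a statistical decision tree for POS disambiguation concrete, here is a minimal sketch. It is not the authors' system: the abstract does not specify the feature set, the splitting criterion, or any pruning, so the sketch assumes a generic ID3-style tree over two hypothetical context features (`prev_tag`, `next_tag`) and invented toy data for a single noun/verb ambiguity class. Each leaf stores a tag probability distribution rather than a single tag.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of tag labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def distribution(labels):
    """Relative tag frequencies; this is what each leaf stores."""
    total = len(labels)
    return {tag: n / total for tag, n in Counter(labels).items()}

def build_tree(examples, features):
    """Grow an ID3-style tree from (feature_dict, tag) pairs (assumed criterion)."""
    labels = [tag for _, tag in examples]
    if len(set(labels)) == 1 or not features:
        return {"leaf": distribution(labels)}

    def split_entropy(feature):
        # Weighted entropy of the tags after partitioning on this feature.
        groups = {}
        for feats, tag in examples:
            groups.setdefault(feats[feature], []).append(tag)
        return sum(len(g) / len(examples) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    if split_entropy(best) >= entropy(labels):
        return {"leaf": distribution(labels)}  # no feature reduces uncertainty

    remaining = [f for f in features if f != best]
    branches = {}
    for feats, tag in examples:
        branches.setdefault(feats[best], []).append((feats, tag))
    return {
        "feature": best,
        "default": distribution(labels),  # fallback for unseen feature values
        "branches": {v: build_tree(sub, remaining) for v, sub in branches.items()},
    }

def tag_distribution(tree, feats):
    """Walk the tree for one context and return a tag probability distribution."""
    while "leaf" not in tree:
        child = tree["branches"].get(feats.get(tree["feature"]))
        if child is None:
            return tree["default"]
        tree = child
    return tree["leaf"]

# Invented toy contexts for a word that can be a noun (NN) or a verb (VB).
examples = [
    ({"prev_tag": "DT", "next_tag": "IN"}, "NN"),
    ({"prev_tag": "DT", "next_tag": "VBZ"}, "NN"),
    ({"prev_tag": "TO", "next_tag": "DT"}, "VB"),
    ({"prev_tag": "TO", "next_tag": "IN"}, "VB"),
    ({"prev_tag": "MD", "next_tag": "DT"}, "VB"),
    ({"prev_tag": "JJ", "next_tag": "IN"}, "NN"),
]
tree = build_tree(examples, ["prev_tag", "next_tag"])
print(tag_distribution(tree, {"prev_tag": "TO", "next_tag": "IN"}))
```

In this toy setting the tree splits on `prev_tag`, and a context whose previous tag was never seen in training falls back to the tag distribution stored at the node where the lookup failed. A real tagger would need many such trees (for example, one per ambiguity class), richer context features, and some way of combining the per-word distributions over a sentence, none of which this sketch attempts.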

Cite

Text

Màrquez and Rodríguez. "Part-of-Speech Tagging Using Decision Trees." European Conference on Machine Learning, 1998. doi:10.1007/BFB0026668

Markdown

[Màrquez and Rodríguez. "Part-of-Speech Tagging Using Decision Trees." European Conference on Machine Learning, 1998.](https://mlanthology.org/ecmlpkdd/1998/marquez1998ecml-partofspeech/) doi:10.1007/BFB0026668

BibTeX

@inproceedings{marquez1998ecml-partofspeech,
  title     = {{Part-of-Speech Tagging Using Decision Trees}},
  author    = {Màrquez, Lluís and Rodríguez, Horacio},
  booktitle = {European Conference on Machine Learning},
  year      = {1998},
  pages     = {25-36},
  doi       = {10.1007/BFB0026668},
  url       = {https://mlanthology.org/ecmlpkdd/1998/marquez1998ecml-partofspeech/}
}