The Role of Lexicalization and Pruning for Base Noun Phrase Grammars

Abstract

This paper explores the role of lexicalization and pruning of grammars for base noun phrase identification. We modify our original framework (Cardie & Pierce 1998) to extract lexicalized treebank grammars that assign a score to each potential noun phrase based upon both the part-of-speech tag sequence and the word sequence of the phrase. We evaluate the modified framework on the “simple” and “complex” base NP corpora of the original study. As expected, we find that lexicalization dramatically improves the performance of the unpruned treebank grammars; however, for the simple base noun phrase data set, the lexicalized grammar performs below the corresponding unlexicalized but pruned grammar, suggesting that lexicalization is not critical for recognizing very simple, relatively unambiguous constituents. Somewhat surprisingly, we also find that error-driven pruning improves the performance of the probabilistic, lexicalized base noun phrase grammars by up to 1.0% recall and 0.4% precision, and does so even using the original pruning strategy that fails to distinguish the effects of lexicalization. This result may have implications for many probabilistic grammar-based approaches to problems in natural language processing: error-driven pruning is a remarkably robust method for improving the performance of probabilistic and non-probabilistic grammars alike.

Cite

Text

Cardie and Pierce. "The Role of Lexicalization and Pruning for Base Noun Phrase Grammars." AAAI Conference on Artificial Intelligence, 1999.

Markdown

[Cardie and Pierce. "The Role of Lexicalization and Pruning for Base Noun Phrase Grammars." AAAI Conference on Artificial Intelligence, 1999.](https://mlanthology.org/aaai/1999/cardie1999aaai-role/)

BibTeX

@inproceedings{cardie1999aaai-role,
  title     = {{The Role of Lexicalization and Pruning for Base Noun Phrase Grammars}},
  author    = {Cardie, Claire and Pierce, David R.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1999},
  pages     = {423--430},
  url       = {https://mlanthology.org/aaai/1999/cardie1999aaai-role/}
}