Inducing Head-Driven PCFGs with Latent Heads: Refining a Tree-Bank Grammar for Parsing
Abstract
Although state-of-the-art parsers for natural language are lexicalized, it was recently shown that an accurate unlexicalized parser for the Penn tree-bank can be simply read off a manually refined tree-bank. While lexicalized parsers often suffer from sparse data, manual mark-up is costly and largely based on individual linguistic intuition. Thus, across domains, languages, and tree-bank annotations, a fundamental question arises: Is it possible to automatically induce an accurate parser from a tree-bank without resorting to full lexicalization? In this paper, we show how to induce a probabilistic parser with latent head information from simple linguistic principles. Our parser has a performance of 85.1% (LP/LR F_1), which is as good as that of early lexicalized ones. This is remarkable since the induction of probabilistic grammars is in general a hard task.
Cite
Text
Prescher. "Inducing Head-Driven PCFGs with Latent Heads: Refining a Tree-Bank Grammar for Parsing." European Conference on Machine Learning, 2005. doi:10.1007/11564096_30
Markdown
[Prescher. "Inducing Head-Driven PCFGs with Latent Heads: Refining a Tree-Bank Grammar for Parsing." European Conference on Machine Learning, 2005.](https://mlanthology.org/ecmlpkdd/2005/prescher2005ecml-inducing/) doi:10.1007/11564096_30
BibTeX
@inproceedings{prescher2005ecml-inducing,
title = {{Inducing Head-Driven PCFGs with Latent Heads: Refining a Tree-Bank Grammar for Parsing}},
author = {Prescher, Detlef},
booktitle = {European Conference on Machine Learning},
year = {2005},
pages = {292--304},
doi = {10.1007/11564096_30},
url = {https://mlanthology.org/ecmlpkdd/2005/prescher2005ecml-inducing/}
}