Distributional Learning of Parallel Multiple Context-Free Grammars
Abstract
Natural languages require grammars beyond context-free for their description. Here we extend a family of distributional learning algorithms for context-free grammars to the class of Parallel Multiple Context-Free Grammars (PMCFGs). These grammars have two additional operations beyond the simple context-free operation of concatenation: the ability to interleave strings of symbols, and the ability to copy or duplicate strings. This allows the grammars to generate some non-semilinear languages, which are outside the class of mildly context-sensitive grammars. These grammars, if augmented with a suitable feature mechanism, are capable of representing all of the syntactic phenomena that have been claimed to exist in natural language. We present a learning algorithm for a large subclass of these grammars that includes all regular languages but not all context-free languages. This algorithm relies on a generalisation of the notion of distribution as a function from tuples of strings to entire sentences; we define nonterminals using finite sets of these functions. Our learning algorithm uses a nonprobabilistic learning paradigm which allows for membership queries as well as positive samples; it runs in polynomial time.
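The two extra operations and the generalised notion of distribution mentioned in the abstract can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy rules S(a) and S(xx) <- S(x) for copying, the pair-valued nonterminals behind `pairs` and `interleave`, the target language a^n b^m c^n d^m, and the names `copy_language`, `is_member`, and `context`.

```python
import re
from typing import List, Tuple

# Copying (toy grammar, illustrative assumption):
#   S(a)            base rule
#   S(x x) <- S(x)  the variable x is used twice on the left-hand side
def copy_language(max_steps: int) -> List[str]:
    """Strings derivable with at most `max_steps` uses of the copying rule."""
    derived = ["a"]
    for _ in range(max_steps):
        derived.append(derived[-1] * 2)  # lengths 1, 2, 4, 8, ... (not semilinear)
    return derived

# Interleaving (toy grammar, illustrative assumption):
#   P derives pairs (a^n, c^n), Q derives pairs (b^m, d^m)
#   S(x1 y1 x2 y2) <- P(x1, x2), Q(y1, y2)
def pairs(left: str, right: str, n: int) -> Tuple[str, str]:
    """The pair (left^n, right^n) derived by a dimension-2 nonterminal."""
    return (left * n, right * n)

def interleave(p: Tuple[str, str], q: Tuple[str, str]) -> str:
    """Combine two pairs so that their components alternate in the output."""
    (x1, x2), (y1, y2) = p, q
    return x1 + y1 + x2 + y2

# A hypothetical membership oracle for the toy language a^n b^m c^n d^m.
def is_member(sentence: str) -> bool:
    m = re.fullmatch(r"(a*)(b*)(c*)(d*)", sentence)
    return bool(m) and len(m.group(1)) == len(m.group(3)) \
        and len(m.group(2)) == len(m.group(4))

# A generalised context: a function from a tuple of strings to a full sentence.
def context(pair: Tuple[str, str]) -> str:
    return pair[0] + "b" + pair[1] + "d"

if __name__ == "__main__":
    print(copy_language(3))
    print(interleave(pairs("a", "c", 2), pairs("b", "d", 1)))
    print(is_member(context(("aa", "cc"))), is_member(context(("aa", "c"))))
```

In this sketch, asking the oracle whether `context(pair)` is a sentence plays the role of a membership query: the pair ("aa", "cc") is accepted by the context while ("aa", "c") is not, which is the kind of distributional evidence a learner could use to group tuples under one nonterminal.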
Cite
Text
Clark and Yoshinaka. "Distributional Learning of Parallel Multiple Context-Free Grammars." Machine Learning, 2014. doi:10.1007/S10994-013-5403-2
Markdown
[Clark and Yoshinaka. "Distributional Learning of Parallel Multiple Context-Free Grammars." Machine Learning, 2014.](https://mlanthology.org/mlj/2014/clark2014mlj-distributional/) doi:10.1007/S10994-013-5403-2
BibTeX
@article{clark2014mlj-distributional,
title = {{Distributional Learning of Parallel Multiple Context-Free Grammars}},
author = {Clark, Alexander and Yoshinaka, Ryo},
journal = {Machine Learning},
year = {2014},
pages = {5-31},
doi = {10.1007/S10994-013-5403-2},
volume = {96},
url = {https://mlanthology.org/mlj/2014/clark2014mlj-distributional/}
}