Synthetic Treebanking for Cross-Lingual Dependency Parsing

Abstract

How do we parse the languages for which no treebanks are available? This contribution addresses the cross-lingual viewpoint on statistical dependency parsing, in which we attempt to make use of resource-rich source language treebanks to build and adapt models for the under-resourced target languages. We outline the benefits and indicate the drawbacks of the current major approaches. We emphasize synthetic treebanking: the automatic creation of target language treebanks by means of annotation projection and machine translation. We present competitive results in cross-lingual dependency parsing using a combination of various techniques that contribute to the overall success of the method. We further include a detailed discussion of the impact of part-of-speech label accuracy on parsing results, which provides guidance for practical applications of cross-lingual methods to truly under-resourced languages.

Cite

Text

Tiedemann and Agić. "Synthetic Treebanking for Cross-Lingual Dependency Parsing." Journal of Artificial Intelligence Research, 2016. doi:10.1613/JAIR.4785

Markdown

[Tiedemann and Agić. "Synthetic Treebanking for Cross-Lingual Dependency Parsing." Journal of Artificial Intelligence Research, 2016.](https://mlanthology.org/jair/2016/tiedemann2016jair-synthetic/) doi:10.1613/JAIR.4785

BibTeX

@article{tiedemann2016jair-synthetic,
  title     = {{Synthetic Treebanking for Cross-Lingual Dependency Parsing}},
  author    = {Tiedemann, Jörg and Agić, Željko},
  journal   = {Journal of Artificial Intelligence Research},
  year      = {2016},
  pages     = {209--248},
  doi       = {10.1613/JAIR.4785},
  volume    = {55},
  url       = {https://mlanthology.org/jair/2016/tiedemann2016jair-synthetic/}
}