Stochastic Inversion Transduction Grammars, with Application to Segmentation, Bracketing, and Alignment of Parallel Corpora

Abstract

We introduce (1) a novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and (2) the concept of bilingual parsing with potential application to a variety of parallel corpus analysis problems. The formalism combines three tactics against the constraints that render finite-state transducers less useful: it skips directly to a context-free rather than finite-state base, it permits a minimal extra degree of ordering flexibility, and its probabilistic formulation admits an efficient maximum-likelihood bilingual parsing algorithm. A convenient normal form is shown to exist, and we discuss a number of examples of how stochastic inversion transduction grammars bring bilingual constraints to bear upon problematic corpus analysis tasks.
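The "minimal extra degree of ordering flexibility" can be pictured with a small sketch. It assumes the usual reading of an inversion transduction grammar, in which the children of each production are emitted either in the same order in both languages (straight) or in reversed order in the second language (inverted). The class and function names (`Leaf`, `Node`, `yields`) are hypothetical illustrations, not taken from the paper, and probabilities are omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    """A matched word pair; an empty string marks a word with no counterpart."""
    src: str
    tgt: str


@dataclass
class Node:
    """A binary production; `inverted` reverses child order in language 2 only."""
    left: "Tree"
    right: "Tree"
    inverted: bool = False


Tree = Union[Leaf, Node]


def yields(tree: Tree) -> Tuple[List[str], List[str]]:
    """Read off the sentence pair generated by one bilingual parse tree."""
    if isinstance(tree, Leaf):
        src = [tree.src] if tree.src else []
        tgt = [tree.tgt] if tree.tgt else []
        return src, tgt
    left_src, left_tgt = yields(tree.left)
    right_src, right_tgt = yields(tree.right)
    # Language 1 is always read left to right.
    src = left_src + right_src
    # Language 2 swaps the two constituents when the production is inverted.
    tgt = right_tgt + left_tgt if tree.inverted else left_tgt + right_tgt
    return src, tgt


# Toy tree with placeholder tokens: the inner constituent is inverted.
tree = Node(Leaf("a", "x"), Node(Leaf("b", "y"), Leaf("c", "z"), inverted=True))
src_words, tgt_words = yields(tree)
print(src_words, tgt_words)
```

In this toy tree a single structure accounts for both sentences, with only the inner constituent's order differing between the two languages; this is the sense in which one parse brackets and aligns a sentence pair at once.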

Cite

Text

Wu. "Stochastic Inversion Transduction Grammars, with Application to Segmentation, Bracketing, and Alignment of Parallel Corpora." International Joint Conference on Artificial Intelligence, 1995.

Markdown

[Wu. "Stochastic Inversion Transduction Grammars, with Application to Segmentation, Bracketing, and Alignment of Parallel Corpora." International Joint Conference on Artificial Intelligence, 1995.](https://mlanthology.org/ijcai/1995/wu1995ijcai-stochastic/)

BibTeX

@inproceedings{wu1995ijcai-stochastic,
  title     = {{Stochastic Inversion Transduction Grammars, with Application to Segmentation, Bracketing, and Alignment of Parallel Corpora}},
  author    = {Wu, Dekai},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {1995},
  pages     = {1328--1337},
  url       = {https://mlanthology.org/ijcai/1995/wu1995ijcai-stochastic/}
}