Unsupervised Estimation for Noisy-Channel Models

Abstract

Shannon’s Noisy-Channel model, which describes how a corrupted message might be reconstructed, has been the cornerstone for much work in statistical language and speech processing. The model factors into two components: a language model to characterize the original message and a channel model to describe the channel’s corruptive process. The standard approach for estimating the parameters of the channel model is unsupervised maximum-likelihood estimation on the observation data, usually approximated using the Expectation-Maximization (EM) algorithm. In this paper we show that it is better to maximize the joint likelihood of the data at both ends of the noisy-channel. We derive a corresponding bi-directional EM algorithm and show that it gives better performance than standard EM on two tasks: (1) translation using a probabilistic lexicon and (2) adaptation of a part-of-speech tagger between related languages.
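The factorization the abstract describes can be sketched as follows; the notation here is illustrative and not necessarily the paper's own:

```latex
% Noisy-channel decoding: recover the source message s from the observation o
\hat{s} = \arg\max_{s} P(s \mid o)
        = \arg\max_{s} \underbrace{P(s)}_{\text{language model}}
          \; \underbrace{P(o \mid s)}_{\text{channel model}}

% Standard unsupervised training of the channel parameters \theta:
% maximize the marginal likelihood of the observations alone, via EM
\theta^{*} = \arg\max_{\theta} \sum_{o} \log \sum_{s} P(s)\, P_{\theta}(o \mid s)
```

The paper's bi-directional EM instead maximizes the joint likelihood of the data on both sides of the channel rather than the observation-side marginal alone.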

Cite

Text

Mylonakis et al. "Unsupervised Estimation for Noisy-Channel Models." International Conference on Machine Learning, 2007. doi:10.1145/1273496.1273580

Markdown

[Mylonakis et al. "Unsupervised Estimation for Noisy-Channel Models." International Conference on Machine Learning, 2007.](https://mlanthology.org/icml/2007/mylonakis2007icml-unsupervised/) doi:10.1145/1273496.1273580

BibTeX

@inproceedings{mylonakis2007icml-unsupervised,
  title     = {{Unsupervised Estimation for Noisy-Channel Models}},
  author    = {Mylonakis, Markos and Sima'an, Khalil and Hwa, Rebecca},
  booktitle = {International Conference on Machine Learning},
  year      = {2007},
  pages     = {665--672},
  doi       = {10.1145/1273496.1273580},
  url       = {https://mlanthology.org/icml/2007/mylonakis2007icml-unsupervised/}
}