A Bayesian LDA-Based Model for Semi-Supervised Part-of-Speech Tagging

Abstract

We present a novel Bayesian model for semi-supervised part-of-speech tagging. Our model extends the Latent Dirichlet Allocation model and incorporates the intuition that words’ distributions over tags, p(t|w), are sparse. In addition, we introduce a model for determining the set of possible tags of a word, which captures important dependencies in the ambiguity classes of words. Our model outperforms the best previously proposed model for this task on a standard dataset.
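The sparsity intuition for p(t|w) can be illustrated with a small sketch. It assumes, purely for illustration, that a word's tag distribution is drawn from a symmetric Dirichlet prior; the abstract does not state the prior or its hyperparameters, so the tag count (`NUM_TAGS`), the concentration values, and the helper names below are all hypothetical. A small concentration tends to put most of the mass on one or two tags, while a large one spreads it nearly evenly.

```python
import random


def sample_dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k outcomes."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:
        # Guard against floating-point underflow for very small alpha.
        return [1.0 / k] * k
    return [d / total for d in draws]


def mean_max_prob(alpha, k, n_samples, rng):
    """Average probability of the most likely tag across sampled p(t|w)."""
    total = 0.0
    for _ in range(n_samples):
        total += max(sample_dirichlet(alpha, k, rng))
    return total / n_samples


rng = random.Random(0)
NUM_TAGS = 10  # hypothetical tag-set size, not taken from the paper

# Assumed concentrations: 0.1 (sparse regime) versus 10.0 (dense regime).
sparse = mean_max_prob(0.1, NUM_TAGS, 2000, rng)
dense = mean_max_prob(10.0, NUM_TAGS, 2000, rng)

print(f"mean max p(t|w), alpha=0.1:  {sparse:.2f}")
print(f"mean max p(t|w), alpha=10.0: {dense:.2f}")
```

Under these assumptions, the average probability of a word's most likely tag is much higher in the low-concentration setting, which is the sense in which such a prior encodes that most words take only a few tags.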

Cite

Text

Toutanova and Johnson. "A Bayesian LDA-Based Model for Semi-Supervised Part-of-Speech Tagging." Neural Information Processing Systems, 2007.

Markdown

[Toutanova and Johnson. "A Bayesian LDA-Based Model for Semi-Supervised Part-of-Speech Tagging." Neural Information Processing Systems, 2007.](https://mlanthology.org/neurips/2007/toutanova2007neurips-bayesian/)

BibTeX

@inproceedings{toutanova2007neurips-bayesian,
  title     = {{A Bayesian LDA-Based Model for Semi-Supervised Part-of-Speech Tagging}},
  author    = {Toutanova, Kristina and Johnson, Mark},
  booktitle = {Neural Information Processing Systems},
  year      = {2007},
  pages     = {1521-1528},
  url       = {https://mlanthology.org/neurips/2007/toutanova2007neurips-bayesian/}
}