Effective Bilingual Constraints for Semi-Supervised Learning of Named Entity Recognizers

Abstract

Most semi-supervised methods in Natural Language Processing capitalize on unannotated resources in a single language; however, information can be gained from using parallel resources in more than one language, since translations of the same utterance in different languages can help to disambiguate each other. We demonstrate a method that makes effective use of vast amounts of bilingual text (a.k.a. bitext) to improve monolingual systems. We propose a factored probabilistic sequence model that encourages both cross-language and intra-document consistency. A simple Gibbs sampling algorithm is introduced for performing approximate inference. Experiments on English-Chinese Named Entity Recognition (NER) using the OntoNotes dataset demonstrate that our method is significantly more accurate than state-of-the-art monolingual CRF models in a bilingual test setting. Our model also improves on previous work by Burkett et al. (2010), achieving a relative error reduction of 10.8% and 4.5% in Chinese and English, respectively. Furthermore, by annotating a moderate amount of unlabeled bitext with our bilingual model, and using the tagged data for uptraining, we achieve a 9.2% error reduction in Chinese over the state-of-the-art Stanford monolingual NER system.
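
The abstract reports its gains as relative error reduction. As a reading aid, here is a minimal sketch of how such a figure is conventionally computed, assuming error is defined as 100 minus a percentage score such as F1. The function name and the example scores are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_score: float, new_score: float) -> float:
    """Return the fraction of the baseline's error removed by the new system.

    Assumption: scores are percentages (e.g. F1 on a 0-100 scale) and
    error is defined as 100 - score.
    """
    baseline_error = 100.0 - baseline_score
    new_error = 100.0 - new_score
    if baseline_error <= 0:
        raise ValueError("baseline has no error left to reduce")
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores for illustration only (not results from the paper):
# a baseline at 80.0 and a new system at 82.0.
example = relative_error_reduction(80.0, 82.0)
print(f"{example:.1%}")
```

Under this definition, a small absolute gain can correspond to a noticeably larger relative error reduction when the baseline is already strong, which is why the two kinds of figures should not be compared directly.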

Cite

Text

Wang et al. "Effective Bilingual Constraints for Semi-Supervised Learning of Named Entity Recognizers." AAAI Conference on Artificial Intelligence, 2013. doi:10.1609/AAAI.V27I1.8617

Markdown

[Wang et al. "Effective Bilingual Constraints for Semi-Supervised Learning of Named Entity Recognizers." AAAI Conference on Artificial Intelligence, 2013.](https://mlanthology.org/aaai/2013/wang2013aaai-effective/) doi:10.1609/AAAI.V27I1.8617

BibTeX

@inproceedings{wang2013aaai-effective,
  title     = {{Effective Bilingual Constraints for Semi-Supervised Learning of Named Entity Recognizers}},
  author    = {Wang, Mengqiu and Che, Wanxiang and Manning, Christopher D.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2013},
  pages     = {919-925},
  doi       = {10.1609/AAAI.V27I1.8617},
  url       = {https://mlanthology.org/aaai/2013/wang2013aaai-effective/}
}