Joint POS Tagging and Text Normalization for Informal Text

Abstract

Text normalization and part-of-speech (POS) tagging for social media data have both been investigated recently; however, prior work has treated them separately. In this paper, we propose a joint Viterbi decoding process that determines each token's POS tag and each non-standard token's correct form at the same time. To evaluate our approach, we create two new data sets annotated with POS tags and the correct forms of non-standard tokens. These are the first data sets with such annotation. The experimental results demonstrate the effect of non-standard words on POS tagging and show that our proposed methods outperform state-of-the-art systems in both POS tagging and normalization.

Cite

Text

Li and Liu. "Joint POS Tagging and Text Normalization for Informal Text." International Joint Conference on Artificial Intelligence, 2015.

Markdown

[Li and Liu. "Joint POS Tagging and Text Normalization for Informal Text." International Joint Conference on Artificial Intelligence, 2015.](https://mlanthology.org/ijcai/2015/li2015ijcai-joint/)

BibTeX

@inproceedings{li2015ijcai-joint,
  title     = {{Joint POS Tagging and Text Normalization for Informal Text}},
  author    = {Li, Chen and Liu, Yang},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2015},
  pages     = {1263--1269},
  url       = {https://mlanthology.org/ijcai/2015/li2015ijcai-joint/}
}