Neural Networks Incorporating Dictionaries for Chinese Word Segmentation

Abstract

In recent years, deep neural networks have achieved significant success in Chinese word segmentation and many other natural language processing tasks. Most of these algorithms are end-to-end trainable systems that can effectively process and learn from large-scale labeled datasets. However, these methods typically handle rare words poorly, as well as data from domains that differ from the training data. Previous statistical methods have demonstrated that human knowledge can provide valuable information for handling rare cases and domain shift. In this paper, we address the problem of incorporating dictionaries into neural networks for the Chinese word segmentation task. We propose two methods that extend the bi-directional long short-term memory neural network to perform the task. To evaluate the proposed methods, we compare them with state-of-the-art supervised models and domain adaptation approaches on nine datasets from different domains. The experimental results demonstrate that the proposed methods achieve better performance than other state-of-the-art neural network methods and domain adaptation approaches in most cases.

Cite

Text

Zhang et al. "Neural Networks Incorporating Dictionaries for Chinese Word Segmentation." AAAI Conference on Artificial Intelligence, 2018. doi:10.1609/AAAI.V32I1.11959

Markdown

[Zhang et al. "Neural Networks Incorporating Dictionaries for Chinese Word Segmentation." AAAI Conference on Artificial Intelligence, 2018.](https://mlanthology.org/aaai/2018/zhang2018aaai-neural/) doi:10.1609/AAAI.V32I1.11959

BibTeX

@inproceedings{zhang2018aaai-neural,
  title     = {{Neural Networks Incorporating Dictionaries for Chinese Word Segmentation}},
  author    = {Zhang, Qi and Liu, Xiaoyu and Fu, Jinlan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {5682--5689},
  doi       = {10.1609/AAAI.V32I1.11959},
  url       = {https://mlanthology.org/aaai/2018/zhang2018aaai-neural/}
}