Towards Incremental NER Data Augmentation via Syntactic-Aware Insertion Transformer
Abstract
Named entity recognition (NER) aims to locate and classify named entities in natural language text. Most existing high-performance NER models follow a supervised paradigm, which requires a large quantity of high-quality annotated data for training. To help NER models perform well in few-shot scenarios, data augmentation approaches build extra data either by random editing or by end-to-end generation with pre-trained language models (PLMs). However, these methods focus only on the fluency of the generated sentences and ignore the syntactic correlation between the new and raw sentences. This lack of correlation also leads to low diversity and inconsistent labeling in the synthetic samples. To fill this gap, we present SAINT (Syntactic-Aware InsertioN Transformer), a hard-constraint controlled text generation model that incorporates syntactic information. The proposed method inserts new tokens between existing entities in parallel. During the insertion procedure, new tokens are added by taking both semantic and syntactic factors into account, so the resulting sentence retains syntactic correctness with respect to the raw data. Experimental results on two benchmark datasets, OntoNotes and WikiAnn, demonstrate that SAINT achieves performance comparable to state-of-the-art baselines.
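The insertion idea described in the abstract can be illustrated with a minimal sketch. This is not the SAINT model: the learned, syntax-aware scoring is replaced by a hypothetical `propose` callback backed by a toy lookup table (`TOY_TABLE`), and all function names here are assumptions made for illustration. The sketch only shows the hard-constraint mechanics: entity tokens are never altered, every gap is filled in the same round from the same input, and inserted tokens receive the `O` label so that the synthetic sample stays consistently labeled.

```python
from typing import Callable, List, Optional, Tuple

Token = Tuple[str, str]  # (word, BIO label)
Proposer = Callable[[List[str], int], Optional[str]]


def insertion_round(seq: List[Token], propose: Proposer) -> List[Token]:
    """Run one parallel insertion round.

    Slot i is the gap before seq[i]; slot len(seq) is the end of the sentence.
    Every proposal is computed from the same input words, so insertions in
    one round do not see each other.
    """
    words = [w for w, _ in seq]
    out: List[Token] = []
    for slot in range(len(seq) + 1):
        # A gap directly before an I- token lies inside an entity span:
        # never insert there, so entity mentions stay intact.
        inside_entity = slot < len(seq) and seq[slot][1].startswith("I-")
        if not inside_entity:
            new_word = propose(words, slot)
            if new_word is not None:
                out.append((new_word, "O"))  # inserted tokens are non-entities
        if slot < len(seq):
            out.append(seq[slot])
    return out


def augment(skeleton: List[Token], propose: Proposer, max_rounds: int = 3) -> List[Token]:
    """Repeat insertion rounds until nothing is inserted or max_rounds is hit."""
    seq = list(skeleton)
    for _ in range(max_rounds):
        nxt = insertion_round(seq, propose)
        if nxt == seq:
            break
        seq = nxt
    return seq


# Hypothetical stand-in for a learned model: maps (left word, right word)
# to the token to insert between them.
TOY_TABLE = {
    ("Obama", "Hawaii"): "visited",
    ("visited", "Hawaii"): "sunny",
}


def toy_propose(words: List[str], slot: int) -> Optional[str]:
    left = words[slot - 1] if slot > 0 else "<s>"
    right = words[slot] if slot < len(words) else "</s>"
    return TOY_TABLE.get((left, right))


skeleton = [("Barack", "B-PER"), ("Obama", "I-PER"), ("Hawaii", "B-LOC")]
augmented = augment(skeleton, toy_propose)
print(" ".join(w for w, _ in augmented))
print(" ".join(label for _, label in augmented))
```

Starting from the entity-only skeleton, the first round inserts a token between the two entities and the second round refines the new gap, while the `B-PER I-PER` span is left untouched. A real system would replace `toy_propose` with a model that scores candidate tokens for each slot.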
Cite
Text
Ke et al. "Towards Incremental NER Data Augmentation via Syntactic-Aware Insertion Transformer." International Joint Conference on Artificial Intelligence, 2023. doi:10.24963/IJCAI.2023/567
Markdown
[Ke et al. "Towards Incremental NER Data Augmentation via Syntactic-Aware Insertion Transformer." International Joint Conference on Artificial Intelligence, 2023.](https://mlanthology.org/ijcai/2023/ke2023ijcai-incremental/) doi:10.24963/IJCAI.2023/567
BibTeX
@inproceedings{ke2023ijcai-incremental,
title = {{Towards Incremental NER Data Augmentation via Syntactic-Aware Insertion Transformer}},
author = {Ke, Wenjun and Tian, Zongkai and Liu, Qi and Wang, Peng and Gao, Jinhua and Qi, Rui},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2023},
pages = {5104-5112},
doi = {10.24963/IJCAI.2023/567},
url = {https://mlanthology.org/ijcai/2023/ke2023ijcai-incremental/}
}