Low-Resource NER by Data Augmentation with Prompting

Abstract

Named entity recognition (NER) is a fundamental information extraction task that seeks to identify entity mentions of certain types in text. Despite numerous advances, existing NER methods rely on extensive supervision for model training and therefore struggle in low-resource scenarios with limited training data. In this paper, we propose a new data augmentation method for low-resource NER that elicits knowledge from BERT with prompting strategies. Specifically, we devise a label-conditioned word replacement strategy that produces more label-consistent examples by capturing the underlying word-label dependencies, and a prompting-with-question-answering method that generates new training data from unlabeled texts. Experimental results confirm the effectiveness of our approach. In particular, in a low-resource scenario with only 150 training sentences, our approach outperforms previous methods without data augmentation by over 40% in F1 and the prior best data augmentation methods by over 2.0% in F1. Furthermore, our approach also applies to a zero-shot scenario, yielding promising results without using any human-labeled data for the task.
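
The general idea of label-conditioned word replacement can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the prompt format or the scoring details, so the BERT model is replaced here by a hypothetical lookup table (TOY_CANDIDATES) behind a hypothetical propose interface, and the BIO tagging scheme is an assumption. The sketch only shows the property the abstract emphasizes, namely that an entity token is swapped for a candidate chosen with its label in view, so the original label sequence stays valid for the new sentence.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for a label-prompted masked language model:
# maps an entity type to words assumed to be compatible with that type.
TOY_CANDIDATES: Dict[str, List[str]] = {
    "PER": ["Alice", "Bob"],
    "LOC": ["Paris", "Tokyo"],
}

# A proposer takes (tokens, position, label) and returns replacement words.
Proposer = Callable[[Sequence[str], int, str], List[str]]


def toy_propose(tokens: Sequence[str], index: int, label: str) -> List[str]:
    """Return candidate words for tokens[index], conditioned on its label."""
    entity_type = label.split("-", 1)[-1]  # "B-PER" -> "PER"
    return [w for w in TOY_CANDIDATES.get(entity_type, []) if w != tokens[index]]


def augment(
    tokens: Sequence[str],
    labels: Sequence[str],
    propose: Proposer,
    rng: random.Random,
) -> Tuple[List[str], List[str]]:
    """Replace each entity token with a label-conditioned candidate.

    Tokens labeled "O" are left untouched, and the label sequence is
    returned unchanged, so the output is a new labeled training example.
    """
    new_tokens = list(tokens)
    for i, label in enumerate(labels):
        if label == "O":
            continue
        candidates = propose(new_tokens, i, label)
        if candidates:
            new_tokens[i] = rng.choice(candidates)
    return new_tokens, list(labels)


if __name__ == "__main__":
    sentence = ["John", "lives", "in", "London"]
    tags = ["B-PER", "O", "O", "B-LOC"]
    print(augment(sentence, tags, toy_propose, random.Random(0)))
```

Keeping the proposer as a plain callable means a real masked language model could be substituted for the lookup table without changing the augmentation loop.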

Cite

Text

Liu et al. "Low-Resource NER by Data Augmentation with Prompting." International Joint Conference on Artificial Intelligence, 2022. doi:10.24963/IJCAI.2022/590

Markdown

[Liu et al. "Low-Resource NER by Data Augmentation with Prompting." International Joint Conference on Artificial Intelligence, 2022.](https://mlanthology.org/ijcai/2022/liu2022ijcai-low/) doi:10.24963/IJCAI.2022/590

BibTeX

@inproceedings{liu2022ijcai-low,
  title     = {{Low-Resource NER by Data Augmentation with Prompting}},
  author    = {Liu, Jian and Chen, Yufeng and Xu, Jinan},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {4252--4258},
  doi       = {10.24963/IJCAI.2022/590},
  url       = {https://mlanthology.org/ijcai/2022/liu2022ijcai-low/}
}