Differentiable Prompt Makes Pre-Trained Language Models Better Few-Shot Learners
Abstract
Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and prompt design, hindering their implementation in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners. The main principle behind this approach is to reformulate potential natural language processing tasks into the task of a pre-trained language model and to differentially optimize the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) plugged into any pre-trained language model; (ii) extended to widespread classification tasks. A comprehensive evaluation on standard NLP tasks demonstrates that the proposed approach achieves better few-shot performance.
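To make the differentiable-prompt idea concrete, the sketch below illustrates it in PyTorch with a HuggingFace masked language model. This is not the authors' implementation, only a minimal illustration of the principle stated in the abstract: the template is a small set of trainable embedding vectors spliced in front of a [MASK] token, the labels are trainable vectors scored against the [MASK] hidden state, and both are optimized with backpropagation. Names such as NUM_PROMPT_TOKENS, NUM_LABELS, and the forward helper are illustrative assumptions; the backbone is frozen here purely to keep the example small.

```python
# Hypothetical minimal sketch of differentiable prompt and label optimization,
# in the spirit of DART; not the paper's official code.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-uncased"            # any masked LM could be plugged in
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
for p in model.parameters():                # freeze the backbone for simplicity
    p.requires_grad_(False)
embeddings = model.get_input_embeddings()   # token-embedding lookup table

NUM_PROMPT_TOKENS = 4                       # length of the soft template (assumption)
NUM_LABELS = 2                              # e.g. binary sentiment (assumption)
hidden = embeddings.embedding_dim

# Trainable template vectors and trainable label vectors, both updated by backprop.
prompt_embeds = torch.nn.Parameter(torch.randn(NUM_PROMPT_TOKENS, hidden) * 0.02)
label_embeds = torch.nn.Parameter(torch.randn(NUM_LABELS, hidden) * 0.02)
optimizer = torch.optim.AdamW([prompt_embeds, label_embeds], lr=1e-3)

def forward(text: str, label: int) -> torch.Tensor:
    # Encode "<text> [MASK]" and splice the soft prompt right before the [MASK].
    enc = tokenizer(text + " " + tokenizer.mask_token, return_tensors="pt")
    tok_embeds = embeddings(enc["input_ids"])                              # (1, L, H)
    mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
    inputs_embeds = torch.cat(
        [tok_embeds[:, :mask_pos], prompt_embeds.unsqueeze(0), tok_embeds[:, mask_pos:]],
        dim=1,
    )
    attention_mask = torch.ones(inputs_embeds.shape[:2], dtype=torch.long)
    out = model(inputs_embeds=inputs_embeds, attention_mask=attention_mask,
                output_hidden_states=True)
    # Hidden state at the (shifted) [MASK] position, scored against the label vectors.
    mask_hidden = out.hidden_states[-1][0, mask_pos + NUM_PROMPT_TOKENS]   # (H,)
    logits = label_embeds @ mask_hidden                                    # (NUM_LABELS,)
    return torch.nn.functional.cross_entropy(logits.unsqueeze(0),
                                             torch.tensor([label]))

# One toy optimization step on a single labeled example.
loss = forward("the movie was wonderful", label=1)
loss.backward()
optimizer.step()
```

Because both the template and the label vectors live in embedding space, no hand-crafted prompt wording or verbalizer search is needed; gradients from the few available labeled examples shape them directly.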
Cite
Text
Zhang et al. "Differentiable Prompt Makes Pre-Trained Language Models Better Few-Shot Learners." International Conference on Learning Representations, 2022.
Markdown
[Zhang et al. "Differentiable Prompt Makes Pre-Trained Language Models Better Few-Shot Learners." International Conference on Learning Representations, 2022.](https://mlanthology.org/iclr/2022/zhang2022iclr-differentiable/)
BibTeX
@inproceedings{zhang2022iclr-differentiable,
title = {{Differentiable Prompt Makes Pre-Trained Language Models Better Few-Shot Learners}},
author = {Zhang, Ningyu and Li, Luoqiu and Chen, Xiang and Deng, Shumin and Bi, Zhen and Tan, Chuanqi and Huang, Fei and Chen, Huajun},
booktitle = {International Conference on Learning Representations},
year = {2022},
url = {https://mlanthology.org/iclr/2022/zhang2022iclr-differentiable/}
}