Black-Box Tuning for Language-Model-as-a-Service

Abstract

Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service, allowing users to design task-specific prompts to query the PTMs through black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of the PTMs are usually unavailable. Can we optimize the task prompts by only accessing the model inference APIs? This paper proposes the black-box tuning framework to optimize the continuous prompt prepended to the input text via derivative-free optimization. Instead of optimizing in the original high-dimensional prompt space, which is intractable for traditional derivative-free optimization, we perform optimization in a randomly generated subspace due to the low intrinsic dimensionality of large PTMs. The experimental results show that black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompts and GPT-3's in-context learning, but also surpasses the gradient-based counterparts, i.e., prompt tuning and full model tuning.
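To make the idea concrete, below is a minimal sketch of optimizing a continuous prompt through a black-box API in a randomly generated subspace. Everything here is illustrative rather than the paper's implementation: `query_loss` is a hypothetical stand-in for the model inference API, the dimensions (`intrinsic_dim`, `prompt_len`, `embed_dim`) are placeholder values, and a simple (1+1) evolution strategy is used in place of the derivative-free optimizer employed in the paper (CMA-ES).

```python
# Sketch: black-box prompt tuning in a random low-dimensional subspace.
# Assumptions (not from the source text): `query_loss` stands in for the
# black-box inference API; a (1+1)-ES replaces the paper's CMA-ES optimizer.
import numpy as np

intrinsic_dim = 500                 # low-dimensional subspace to optimize in
prompt_len, embed_dim = 50, 1024    # continuous prompt: 50 tokens x 1024-d embeddings

rng = np.random.default_rng(0)
# Fixed random projection from the subspace to the full prompt space.
A = rng.normal(0.0, 1.0 / intrinsic_dim, size=(prompt_len * embed_dim, intrinsic_dim))

def query_loss(prompt: np.ndarray) -> float:
    """Hypothetical black-box API call: the prompt embeddings would be prepended
    to the input text on the server side and a scalar loss on the few labeled
    samples returned. Replaced here by a dummy quadratic for illustration."""
    return float(np.sum(prompt ** 2))

def black_box_tune(budget: int = 1000, sigma: float = 1.0) -> np.ndarray:
    """(1+1)-ES over the low-dimensional variable z; the prompt is A @ z."""
    z = np.zeros(intrinsic_dim)
    best = query_loss(A @ z)
    for _ in range(budget):
        cand = z + sigma * rng.normal(size=intrinsic_dim)
        loss = query_loss(A @ cand)          # one forward pass through the API
        if loss < best:
            z, best = cand, loss
    return (A @ z).reshape(prompt_len, embed_dim)   # final continuous prompt

prompt = black_box_tune()
print(prompt.shape)   # (50, 1024)
```

The key design point is that only the `intrinsic_dim`-dimensional variable `z` is ever optimized; the fixed random matrix `A` maps it back to the full prompt space, keeping the search tractable for derivative-free methods while the PTM itself is touched only through forward (inference) calls.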

Cite

Text

Sun et al. "Black-Box Tuning for Language-Model-as-a-Service." International Conference on Machine Learning, 2022.

Markdown

[Sun et al. "Black-Box Tuning for Language-Model-as-a-Service." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/sun2022icml-blackbox/)

BibTeX

@inproceedings{sun2022icml-blackbox,
  title     = {{Black-Box Tuning for Language-Model-as-a-Service}},
  author    = {Sun, Tianxiang and Shao, Yunfan and Qian, Hong and Huang, Xuanjing and Qiu, Xipeng},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {20841--20855},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/sun2022icml-blackbox/}
}