Prompt Sketching for Large Language Models

Abstract

Many recent prompting strategies for large language models (LLMs) query the model multiple times in sequence: first to produce intermediate results and then the final answer. With these methods, however, neither the decoder nor the model is aware of potential follow-up prompts, leading to disconnected and undesirably wordy intermediate responses. In this work, we address this issue by proposing prompt sketching, a new prompting paradigm in which an LLM does not respond by merely completing a prompt, but by predicting values for multiple variables in a template. In this way, sketching grants users more control over the generation process, e.g., by providing a reasoning framework via intermediate instructions, leading to better overall results. The key idea enabling sketching with existing, autoregressive models is to adapt the decoding procedure to also score follow-up instructions during text generation, thereby optimizing the overall template likelihood during inference. Our experiments show that in a zero-shot setting, prompt sketching outperforms existing, sequential prompting schemes such as direct asking or chain-of-thought on 7 out of 8 LLM benchmarking tasks, including state tracking, arithmetic reasoning, and general question answering. To facilitate future use, we release a number of generic yet effective sketches applicable to many tasks, as well as an open-source library called dclib, powering our sketch-aware decoders, as part of https://github.com/eth-sri/lmql.
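To make the core idea concrete, the following is a minimal, illustrative sketch of template-likelihood scoring, not the authors' dclib implementation or the LMQL API. It assumes a Hugging Face causal model (here "gpt2" as a stand-in) and hypothetical helper names (template_logprob, pick_best): candidate values for a template variable are ranked by the joint log-likelihood of the fully instantiated template, so that fixed follow-up instructions influence which candidate is selected, unlike plain left-to-right decoding.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any autoregressive Hugging Face model works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def template_logprob(text: str) -> float:
    """Sum of token log-probabilities of `text` under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Positions 0..T-2 predict tokens 1..T-1.
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_lp = logprobs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_lp.sum().item()

# A two-variable sketch: {thought} is followed by a fixed instruction
# before {answer}, mirroring intermediate instructions in a template.
sketch = (
    "Q: {question}\n"
    "Let's think step by step: {thought}\n"
    "Therefore, the answer is {answer}."
)

def pick_best(question: str, thought_candidates, answer: str) -> str:
    # Sketch-aware selection: the fixed follow-up text ("Therefore, ...")
    # contributes to each candidate's score.
    scored = {
        t: template_logprob(
            sketch.format(question=question, thought=t, answer=answer)
        )
        for t in thought_candidates
    }
    return max(scored, key=scored.get)

In the paper's decoders this scoring happens during generation (e.g., within beam search over variable values) rather than by re-scoring complete candidates as above; the re-scoring form is used here only because it is the simplest way to show follow-up instructions entering the objective.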

Cite

Text

Beurer-Kellner et al. "Prompt Sketching for Large Language Models." International Conference on Machine Learning, 2024.

Markdown

[Beurer-Kellner et al. "Prompt Sketching for Large Language Models." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/beurerkellner2024icml-prompt/)

BibTeX

@inproceedings{beurerkellner2024icml-prompt,
  title     = {{Prompt Sketching for Large Language Models}},
  author    = {Beurer-Kellner, Luca and Mueller, Mark Niklas and Fischer, Marc and Vechev, Martin},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {3674--3706},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/beurerkellner2024icml-prompt/}
}