Grammar-Aligned Decoding
Abstract
Large Language Models (LLMs) struggle with reliably generating highly structured outputs, such as program code, mathematical formulas, or well-formed markup. Constrained decoding approaches mitigate this problem by greedily restricting which tokens an LLM can output at each step to guarantee that the output matches a given constraint. Specifically, in grammar-constrained decoding (GCD), the LLM's output must follow a given grammar. In this paper, we demonstrate that GCD techniques (and constrained decoding techniques in general) can distort the LLM's distribution, leading to outputs that are grammatical but appear with likelihoods that are not proportional to the ones given by the LLM, and so ultimately are low-quality. We call the problem of aligning sampling with a grammar constraint grammar-aligned decoding (GAD), and propose adaptive sampling with approximate expected futures (ASAp), a decoding algorithm that guarantees the output to be grammatical while provably producing outputs that match the conditional probability of the LLM's distribution conditioned on the given grammar constraint. Our algorithm uses prior sample outputs to soundly overapproximate the future grammaticality of different output prefixes. Our evaluation on code generation and structured NLP tasks shows that ASAp often produces outputs with higher likelihood (according to the LLM's distribution) than existing GCD techniques, while still enforcing the desired grammatical constraints.
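The distortion described in the abstract can be illustrated on a toy model. The sketch below is not the paper's ASAp algorithm, and every name and number in it is a made-up assumption: a two-token vocabulary, hand-picked next-token probabilities standing in for an LLM, and a "grammar" given as a finite set of accepted strings. It compares the distribution produced by greedy per-step masking (GCD-style) with the LLM's distribution conditioned on grammaticality, which is the target that grammar-aligned decoding aims for.

```python
# Toy illustration only: all probabilities and the grammar are hypothetical.
VOCAB = ("a", "b")

# Stand-in for an LLM over strings of length 2.
FIRST = {"a": 0.9, "b": 0.1}                    # P(first token)
SECOND = {"a": {"a": 0.99, "b": 0.01},          # P(second token | first token)
          "b": {"a": 0.5, "b": 0.5}}

# Stand-in for a grammar: the finite set of accepted strings.
GRAMMAR = {"ab", "ba", "bb"}


def llm_prob(s):
    """Unconstrained probability the toy LLM assigns to a length-2 string."""
    return FIRST[s[0]] * SECOND[s[0]][s[1]]


def is_viable_prefix(prefix):
    """True if some grammatical string starts with this prefix."""
    return any(w.startswith(prefix) for w in GRAMMAR)


def gcd_distribution():
    """Greedy masking: at each step drop tokens that cannot lead to a
    grammatical string, then renormalize the remaining probabilities."""
    dist = {}
    allowed_first = [t for t in VOCAB if is_viable_prefix(t)]
    z1 = sum(FIRST[t] for t in allowed_first)
    for t1 in allowed_first:
        allowed_second = [t for t in VOCAB if t1 + t in GRAMMAR]
        z2 = sum(SECOND[t1][t] for t in allowed_second)
        for t2 in allowed_second:
            dist[t1 + t2] = (FIRST[t1] / z1) * (SECOND[t1][t2] / z2)
    return dist


def conditional_distribution():
    """The LLM's distribution conditioned on the output being grammatical,
    computed by brute-force enumeration (feasible only for toy cases)."""
    z = sum(llm_prob(w) for w in GRAMMAR)
    return {w: llm_prob(w) / z for w in sorted(GRAMMAR)}


if __name__ == "__main__":
    gcd = gcd_distribution()
    cond = conditional_distribution()
    for w in sorted(GRAMMAR):
        print(f"{w}: GCD={gcd[w]:.3f}  conditional={cond[w]:.3f}")
```

Under these assumed numbers, per-step masking gives the string `ab` probability 0.9, because the prefix `a` is likely and its only grammatical continuation is forced, whereas the conditional distribution gives it roughly 0.083, since the LLM itself rarely follows `a` with `b`. Brute-force enumeration is used here only because the toy space is tiny; the abstract describes ASAp as instead using prior samples to overapproximate the future grammaticality of prefixes.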
Cite
Text
Park et al. "Grammar-Aligned Decoding." Neural Information Processing Systems, 2024. doi:10.52202/079017-0774
Markdown
[Park et al. "Grammar-Aligned Decoding." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/park2024neurips-grammaraligned/) doi:10.52202/079017-0774
BibTeX
@inproceedings{park2024neurips-grammaraligned,
title = {{Grammar-Aligned Decoding}},
author = {Park, Kanghee and Wang, Jiayu and Berg-Kirkpatrick, Taylor and Polikarpova, Nadia and D'Antoni, Loris},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-0774},
url = {https://mlanthology.org/neurips/2024/park2024neurips-grammaraligned/}
}