Corrector Sampling in Language Models

Abstract

Autoregressive language models accumulate errors due to their fixed, irrevocable left-to-right token generation. To address this, we propose a new sampling method called Resample-Previous-Tokens (RPT). RPT mitigates error accumulation by iteratively revisiting and potentially replacing tokens in a window of previously generated text. Fine-tuning a pretrained 8B parameter model with RPT for only 100B tokens resulted in ~10% relative improvements on reasoning and coding benchmarks compared to standard sampling.
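
The sketch below illustrates the kind of sampling loop the abstract describes: generate the next token as usual, then revisit a small window of previously generated tokens and possibly replace them. It is a minimal illustration only, assuming a hypothetical model interface (`next_token_logits`, `window_proposal_logits`) and a trivial acceptance rule; it is not the authors' implementation.

```python
import torch


def rpt_generate(model, prompt_ids, max_new_tokens=128, window=8):
    """Sketch of a Resample-Previous-Tokens-style loop (illustrative only)."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # Standard autoregressive step: sample the next token.
        logits = model.next_token_logits(torch.tensor([ids]))  # assumed helper
        probs = torch.softmax(logits[0, -1], dim=-1)
        ids.append(torch.multinomial(probs, 1).item())

        # Revisit a window of previously generated tokens and
        # potentially replace each one with a resampled token.
        start = max(len(prompt_ids), len(ids) - window)
        for pos in range(start, len(ids) - 1):
            # Assumed helper: proposal distribution for the token at `pos`,
            # conditioned on its surrounding context.
            proposal = model.window_proposal_logits(torch.tensor([ids]), pos)
            new_probs = torch.softmax(proposal[0], dim=-1)
            # Illustrative rule: always accept the resampled token.
            ids[pos] = torch.multinomial(new_probs, 1).item()
    return ids
```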

Cite

Text

Gat et al. "Corrector Sampling in Language Models." Advances in Neural Information Processing Systems, 2025.

Markdown

[Gat et al. "Corrector Sampling in Language Models." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/gat2025neurips-corrector/)

BibTeX

@inproceedings{gat2025neurips-corrector,
  title     = {{Corrector Sampling in Language Models}},
  author    = {Gat, Itai and Shaul, Neta and Singer, Uriel and Lipman, Yaron},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/gat2025neurips-corrector/}
}