Teaching Language Models to Critique via Reinforcement Learning
Abstract
Teaching large language models (LLMs) to critique and refine their outputs is crucial for building systems that can iteratively improve, yet it is fundamentally limited by the ability to provide accurate judgments and actionable suggestions. In this work, we study LLM critics for code generation and propose $\texttt{CTRL}$, a framework for $\texttt{C}$ritic $\texttt{T}$raining via $\texttt{R}$einforcement $\texttt{L}$earning, which trains a critic model to generate feedback that maximizes correction performance for a fixed generator model without human supervision. Our results demonstrate that critics trained with $\texttt{CTRL}$ significantly enhance pass rates and mitigate compounding errors across both base and stronger generator models. Furthermore, we show that these critic models act as accurate generative reward models and enable test-time scaling through iterative critique-revision, achieving up to 106.1% relative improvements across challenging code generation benchmarks.
Cite
Text
Xie et al. "Teaching Language Models to Critique via Reinforcement Learning." ICLR 2025 Workshops: DL4C, 2025.
Markdown
[Xie et al. "Teaching Language Models to Critique via Reinforcement Learning." ICLR 2025 Workshops: DL4C, 2025.](https://mlanthology.org/iclrw/2025/xie2025iclrw-teaching/)
BibTeX
@inproceedings{xie2025iclrw-teaching,
title = {{Teaching Language Models to Critique via Reinforcement Learning}},
author = {Xie, Zhihui and Chen, Jie and Chen, Liyu and Mao, Weichao and Xu, Jingjing and Kong, Lingpeng},
booktitle = {ICLR 2025 Workshops: DL4C},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/xie2025iclrw-teaching/}
}