References Improve LLM Alignment in Non-Verifiable Domains

Shi, Kejian; Liu, Yixin; Wang, PeiFeng; Fabbri, Alexander; Joty, Shafiq; Cohan, Arman

ICLR 2026

Abstract

While Reinforcement Learning with Verifiable Rewards (RLVR) has shown strong effectiveness in reasoning tasks, it cannot be directly applied to non-verifiable domains lacking ground-truth verifiers, such as LLM alignment. In this work, we investigate whether high-quality reference outputs can be effectively leveraged to bridge this gap. First, we design evaluation protocols that enhance LLM-based evaluators for LLM alignment using reference outputs. Through comprehensive experiments, we show that a reference-guided approach substantially improves the accuracy of less capable LLM-judges using references from frontier models; stronger LLM-judges can also be enhanced by human-written references. We then demonstrate the utility of high-quality references in alignment tuning, where LLMs guided with references are used as judges to self-improve. We show that reference-guided self-improvement yields clear gains over both SFT distillation and reference-free baselines, achieving performance comparable to training with finetuned reward models. Specifically, our method achieves scores of 73.1% and 58.7% on AlpacaEval and Arena-Hard with Llama-3-8B-Instruct, and 70.0% and 74.1% with Qwen2.5-7B. These results highlight the potential of using reference-guided LLM-evaluators to enable effective post-training in non-verifiable domains.

Cite

Text

Shi et al. "References Improve LLM Alignment in Non-Verifiable Domains." International Conference on Learning Representations, 2026.

Markdown

[Shi et al. "References Improve LLM Alignment in Non-Verifiable Domains." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/shi2026iclr-references/)

BibTeX

@inproceedings{shi2026iclr-references,
  title     = {{References Improve LLM Alignment in Non-Verifiable Domains}},
  author    = {Shi, Kejian and Liu, Yixin and Wang, PeiFeng and Fabbri, Alexander and Joty, Shafiq and Cohan, Arman},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/shi2026iclr-references/}
}