Keep the Best, Forget the REST: Reliable Alignment with Order-Aware Preference Optimization

Abstract

Direct Preference Optimization (DPO) has emerged as a powerful framework for aligning large language models (LLMs) with human preferences via pairwise comparisons. However, its performance is highly sensitive to the quality of training samples: when the reference policy is poorly aligned with human preferences, ambiguous pairs can dominate the gradient signal and degrade generalization. To address this, we propose RAPPO ($\textbf{R}$eliable $\textbf{A}$lignment for $\textbf{P}$reference $\textbf{P}$olicy $\textbf{O}$ptimization), a simple sample-aware modification of the DPO loss that mitigates reference-policy misalignment by filtering out the hardest, most ambiguous samples. We theoretically show that RAPPO yields improved generalization guarantees. RAPPO is lightweight and requires only a few lines of code to be integrated into any existing DPO-type algorithm. Surprisingly, with this simple modification, our simulations across a broad suite of alignment tasks and benchmarks show consistent gains over DPO and recent state-of-the-art baselines. On the PKU-SafeRLHF benchmark, RAPPO attains helpfulness $0.693$ ($+34.8\%$ over DPO) and harmlessness $0.357$ ($-21.0\%$ vs. DPO).
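The abstract does not give RAPPO's exact filtering rule, so the following is only an illustrative sketch of the general idea of dropping the hardest pairs from a DPO-style loss. It assumes "hardest" means the largest per-pair DPO loss within a batch and that a fixed fraction is discarded; the function names, the `drop_frac` parameter, and the ranking criterion are all hypothetical and may differ from the paper's method.

```python
import math

def dpo_pair_losses(policy_logratios, ref_logratios, beta=0.1):
    """Standard per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin)).

    Each log-ratio is log p(chosen | x) - log p(rejected | x) under the given model.
    """
    losses = []
    for pi_lr, ref_lr in zip(policy_logratios, ref_logratios):
        margin = beta * (pi_lr - ref_lr)
        # -log(sigmoid(m)) == softplus(-m), written in a numerically stable form
        losses.append(max(-margin, 0.0) + math.log1p(math.exp(-abs(margin))))
    return losses

def filtered_dpo_loss(policy_logratios, ref_logratios, beta=0.1, drop_frac=0.25):
    """Mean DPO loss over the easiest pairs (hypothetical filtering rule).

    ASSUMPTION: pairs are ranked by their own loss and the top `drop_frac`
    fraction (the hardest ones) is discarded. This is not taken from the paper.
    """
    losses = dpo_pair_losses(policy_logratios, ref_logratios, beta)
    n_keep = max(1, len(losses) - int(drop_frac * len(losses)))
    kept = sorted(losses)[:n_keep]  # keep the smallest losses, drop the rest
    return sum(kept) / len(kept)

# Toy batch: the last pair is strongly mis-ordered and dominates the plain mean.
policy = [2.0, 0.5, -1.0, -6.0]
reference = [0.0, 0.0, 0.0, 0.0]
plain = filtered_dpo_loss(policy, reference, beta=1.0, drop_frac=0.0)
trimmed = filtered_dpo_loss(policy, reference, beta=1.0, drop_frac=0.25)
```

In an autograd framework the same selection would typically be applied as a mask over the per-sample losses, with the ranking treated as a constant so that gradients flow only through the kept pairs.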

Cite

Text

Zhu et al. "Keep the Best, Forget the REST: Reliable Alignment with Order-Aware Preference Optimization." International Conference on Learning Representations, 2026.

Markdown

[Zhu et al. "Keep the Best, Forget the REST: Reliable Alignment with Order-Aware Preference Optimization." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhu2026iclr-keep/)

BibTeX

@inproceedings{zhu2026iclr-keep,
  title     = {{Keep the Best, Forget the REST: Reliable Alignment with Order-Aware Preference Optimization}},
  author    = {Zhu, Jiahui and Shi, Yuanjie and Peng, Xiyue and Liu, Xin and Yan, Yan and Wei, Honghao},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/zhu2026iclr-keep/}
}