Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization

Wang, Jianing; Zhou, Yang; Zhang, Xiaocheng; Bao, Mengjiao; Yan, Peng

doi:10.1609/AAAI.V39I24.34724

AAAI 2025 pp. 25362-25370

Abstract

Iterative preference optimization has recently become one of the de facto training paradigms for large language models (LLMs), but its performance remains underwhelming because the loop yields too much noisy preference data. To combat this issue, we present an Uncertainty-enhanced Preference Optimization (UPO) framework that makes the LLM self-evolve with reliable feedback. The key idea is to mitigate the noisy preference pairs derived from the current policy and reward models by performing pair-wise uncertainty estimation and judiciously sampling reliable feedback. To this end, we introduce an estimator model that incorporates Monte Carlo (MC) dropout in a Bayesian neural network (BNN) to perform uncertainty estimation for a batch of preference pairs. Compared with existing methods that directly filter generated responses based on the reward score, the estimator focuses on model uncertainty in a pair-wise manner and effectively bypasses the confirmation bias problem of the reward model. Additionally, we propose an uncertainty-enhanced self-evolution algorithm that helps the LLM align robustly with this reliable feedback data. Extensive experiments over multiple benchmarks demonstrate that our framework substantially improves the performance of iterative preference optimization.
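
The pair-wise uncertainty estimation described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator model: the toy linear scorer, the feature vectors, and the names `stochastic_pass`, `mc_dropout_pair`, and `select_reliable` are all assumptions made for illustration. It only shows the general MC dropout idea: keep dropout active at inference time, run several stochastic forward passes per preference pair, and treat the spread of the predicted preference probability as the uncertainty.

```python
import math
import random
from statistics import mean, pvariance


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def stochastic_pass(weights, features, drop_rate, rng):
    """One forward pass of a toy linear scorer with dropout left on."""
    keep = 1.0 - drop_rate
    total = 0.0
    for w, x in zip(weights, features):
        if rng.random() < keep:
            total += w * x / keep  # inverted-dropout scaling
    return total


def mc_dropout_pair(weights, chosen, rejected, passes=50, drop_rate=0.2, rng=None):
    """Return (mean, variance) of P(chosen preferred) over stochastic passes."""
    rng = rng or random.Random(0)
    probs = []
    for _ in range(passes):
        margin = (stochastic_pass(weights, chosen, drop_rate, rng)
                  - stochastic_pass(weights, rejected, drop_rate, rng))
        probs.append(sigmoid(margin))
    return mean(probs), pvariance(probs)


def select_reliable(pairs, weights, keep_fraction=0.5, **kwargs):
    """Keep the fraction of pairs with the lowest predictive variance.

    Hypothetical selection rule; the paper's sampling strategy may differ.
    """
    scored = [(mc_dropout_pair(weights, c, r, **kwargs), (c, r)) for c, r in pairs]
    scored.sort(key=lambda item: item[0][1])  # ascending variance
    n_keep = max(1, int(len(scored) * keep_fraction))
    return [pair for _, pair in scored[:n_keep]]
```

In this sketch a pair is ranked by the variance of its preference probability rather than by a single reward score, which mirrors the contrast the abstract draws with reward-score filtering. In practice the scorer would be a neural network with dropout layers rather than a hand-written linear model.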

Cite

Text

Wang et al. "Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I24.34724

Markdown

[Wang et al. "Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/wang2025aaai-self/) doi:10.1609/AAAI.V39I24.34724

BibTeX

@inproceedings{wang2025aaai-self,
  title     = {{Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization}},
  author    = {Wang, Jianing and Zhou, Yang and Zhang, Xiaocheng and Bao, Mengjiao and Yan, Peng},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {25362-25370},
  doi       = {10.1609/AAAI.V39I24.34724},
  url       = {https://mlanthology.org/aaai/2025/wang2025aaai-self/}
}