Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization
Abstract
Iterative preference optimization has recently become one of the de facto training paradigms for large language models (LLMs), but its performance remains underwhelming because the loop yields too much noisy preference data. To combat this issue, we present an Uncertainty-enhanced Preference Optimization (UPO) framework that makes the LLM self-evolve with reliable feedback. The key idea is to mitigate the noisy preference pairs derived from the current policy and reward models by performing pair-wise uncertainty estimation and judiciously sampling reliable feedback. To this end, we introduce an estimator model that incorporates Monte Carlo (MC) dropout in a Bayesian neural network (BNN) to perform uncertainty estimation for a batch of preference pairs. Compared with existing methods that directly filter generated responses based on the reward score, the estimator focuses on model uncertainty in a pair-wise manner and effectively bypasses the confirmation bias problem of the reward model. Additionally, we propose an uncertainty-enhanced self-evolution algorithm that helps the LLM align robustly with these reliable feedback data. Extensive experiments over multiple benchmarks demonstrate that our framework substantially improves the performance of iterative preference optimization.
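The pair-wise uncertainty estimation described in the abstract can be illustrated with a minimal sketch of MC dropout. It is not the paper's estimator: the toy logistic scorer, the per-pair feature vectors, the function names, and the variance threshold are all assumptions made for illustration. The idea shown is only that dropout stays active at inference, several stochastic forward passes are run per preference pair, and pairs whose predictions vary too much are treated as unreliable.

```python
import math
import random


def mc_dropout_predict(features, weights, n_passes=50, drop_prob=0.1, rng=None):
    """Toy MC dropout: run n_passes stochastic passes of a logistic scorer.

    Returns (mean, variance) of the predicted probability that the
    'chosen' response is truly preferred. Hypothetical stand-in for a BNN.
    """
    rng = rng or random.Random(0)
    keep = 1.0 - drop_prob
    probs = []
    for _ in range(n_passes):
        # Inverted dropout: drop each input with prob drop_prob, rescale the rest.
        logit = sum(
            w * x / keep
            for w, x in zip(weights, features)
            if rng.random() < keep
        )
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    mean = sum(probs) / n_passes
    var = sum((p - mean) ** 2 for p in probs) / n_passes
    return mean, var


def select_reliable_pairs(pairs, weights, max_var=0.01, n_passes=50,
                          drop_prob=0.1, seed=0):
    """Keep only pairs whose MC dropout variance is at most max_var (assumed rule)."""
    rng = random.Random(seed)
    kept = []
    for pair in pairs:
        _, var = mc_dropout_predict(pair["features"], weights,
                                    n_passes, drop_prob, rng)
        if var <= max_var:
            kept.append(pair)
    return kept
```

In this sketch the filter looks at the spread of the estimator's predictions rather than at a single reward score, which is the contrast the abstract draws with reward-score filtering. The actual sampling strategy used by UPO is not specified in the abstract.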
Cite
Text
Wang et al. "Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I24.34724
Markdown
[Wang et al. "Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/wang2025aaai-self/) doi:10.1609/AAAI.V39I24.34724
BibTeX
@inproceedings{wang2025aaai-self,
title = {{Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization}},
author = {Wang, Jianing and Zhou, Yang and Zhang, Xiaocheng and Bao, Mengjiao and Yan, Peng},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {25362--25370},
doi = {10.1609/AAAI.V39I24.34724},
url = {https://mlanthology.org/aaai/2025/wang2025aaai-self/}
}