ML Anthology
Huo, Mingyue
1 publication
ICLR 2025
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning
Yuheng Zhang, Dian Yu, Baolin Peng, Linfeng Song, Ye Tian, Mingyue Huo, Nan Jiang, Haitao Mi, Dong Yu