Decision Preference Alignment for Large-Scale Agents Based on Reward Model Generation
Abstract
This paper presents a novel data-generation method for aligning the decision preferences of large-scale agents. Despite the recent increase in attention to AI alignment, machine learning approaches to alignment remain challenging due to the lack of data. Trajectory data representing agent behavior is essential in alignment methods such as Reinforcement Learning from Human Feedback (RLHF) and Inverse Reinforcement Learning (IRL). In this paper, we significantly reduce the dependence on trajectory data. Our method takes a generative approach, shifting the focus from learning a reward model for alignment to learning how to generate sample data for alignment. As a result, the data usable for alignment is broadened to include both microscopic and macroscopic information. By designing detailed macro- and micro-level metrics, we verify that simulation results of the passenger boarding process driven by generated decision preferences closely match those guided by ground-truth decision preferences.
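To make the idea of matching generated decision preferences against ground-truth ones concrete, the following is a minimal, hypothetical Python sketch (not the paper's actual pipeline). It assumes a toy passenger-boarding simulator, linear reward weights as the "decision preference", and simple macro/micro statistics; all function names (`simulate_boarding`, `generate_candidate_weights`, `alignment_gap`) are illustrative inventions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_boarding(reward_weights, n_agents=200):
    """Toy stand-in for a passenger-boarding simulation: each agent trades off
    walking distance against queue length according to its reward weights.
    Returns a macro metric (longest final queue) and micro metrics
    (share of passengers choosing each door)."""
    w_dist, w_queue = reward_weights
    distances = rng.uniform(1.0, 10.0, size=(n_agents, 3))  # distance to 3 doors
    queues = np.zeros(3)
    choices = np.empty(n_agents, dtype=int)
    for i in range(n_agents):
        utility = -w_dist * distances[i] - w_queue * queues  # simple linear reward
        choices[i] = int(np.argmax(utility))
        queues[choices[i]] += 1
    macro = queues.max()
    micro = np.bincount(choices, minlength=3) / n_agents
    return macro, micro

def generate_candidate_weights(n_candidates=50):
    """Generator stand-in: propose candidate decision preferences directly,
    instead of recovering them from trajectory data."""
    return rng.uniform(0.0, 1.0, size=(n_candidates, 2))

def alignment_gap(candidate, target_macro, target_micro):
    """Distance between a candidate's simulated statistics and the target ones."""
    macro, micro = simulate_boarding(candidate)
    return abs(macro - target_macro) + np.abs(micro - target_micro).sum()

# "Ground-truth" preferences and the macro/micro statistics they induce.
true_weights = np.array([0.7, 0.3])
target_macro, target_micro = simulate_boarding(true_weights)

# Select the generated preference whose simulated statistics best match the target.
candidates = generate_candidate_weights()
best = min(candidates, key=lambda c: alignment_gap(c, target_macro, target_micro))
print("selected preference weights:", best)
```

The sketch only illustrates the evaluation idea stated in the abstract: preferences are judged by whether the simulations they drive reproduce both macroscopic and microscopic observations, rather than by fitting individual trajectories.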
Cite
Text
Jiaoling et al. "Decision Preference Alignment for Large-Scale Agents Based on Reward Model Generation." ICLR 2025 Workshops: Bi-Align, 2025.
Markdown
[Jiaoling et al. "Decision Preference Alignment for Large-Scale Agents Based on Reward Model Generation." ICLR 2025 Workshops: Bi-Align, 2025.](https://mlanthology.org/iclrw/2025/jiaoling2025iclrw-decision/)
BibTeX
@inproceedings{jiaoling2025iclrw-decision,
title = {{Decision Preference Alignment for Large-Scale Agents Based on Reward Model Generation}},
author = {Jiaoling, Zheng and Weifeng, Xu and Qian, Luo and Wanli, Dang and Long, Geng and Guokang, Gao and Yulin, Ren and Xingyu, Fan},
booktitle = {ICLR 2025 Workshops: Bi-Align},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/jiaoling2025iclrw-decision/}
}