Xu, Xinbo

1 publications

ICLR 2024 Safe RLHF: Safe Reinforcement Learning from Human Feedback Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang