Sun, Ruiyang

4 publications

MLOSS 2024. OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research. Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang
ICLR 2024. Safe RLHF: Safe Reinforcement Learning from Human Feedback. Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang
NeurIPS 2023. BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset. Jiaming Ji, Mickel Liu, Josef Dai, Xuehai Pan, Chi Zhang, Ce Bian, Boyuan Chen, Ruiyang Sun, Yizhou Wang, Yaodong Yang
NeurIPS 2023. Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark. Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Josef Dai, Yaodong Yang