Chen, Guoxi

1 publication

TMLR 2026. A Tighter Bound for Reward Learning in Reinforcement Learning from Human Feedback. Guoxi Chen, Xing Chen, Bo An, Ya Zhang.