Zhu, Banghua

24 publications

ICML 2025 From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and Benchbuilder Pipeline Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica
ICLR 2025 How to Evaluate Reward Models for RLHF Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica
ALT 2025 Noisy Computing of the Threshold Function Ziao Wang, Nadim Ghaddar, Banghua Zhu, Lele Wang
ICLR 2025 Taming Overconfidence in LLMs: Reward Calibration in RLHF Jixuan Leng, Chengsong Huang, Banghua Zhu, Jiaxin Huang
ICML 2024 Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Banghua Zhu, Hao Zhang, Michael Jordan, Joseph E. Gonzalez, Ion Stoica
ICML 2024 Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF Banghua Zhu, Michael Jordan, Jiantao Jiao
ICLR 2024 The Effective Horizon Explains Deep RL Performance in Stochastic Environments Cassidy Laidlaw, Banghua Zhu, Stuart Russell, Anca Dragan
ICLR 2024 Towards the Fundamental Limits of Knowledge Transfer over Finite Domains Qingyue Zhao, Banghua Zhu
NeurIPSW 2023 A Theoretical Explanation of Deep RL Performance in Stochastic Environments Cassidy Laidlaw, Banghua Zhu, Stuart Russell, Anca Dragan
NeurIPSW 2023 A Theoretical Explanation of Deep RL Performance in Stochastic Environments Cassidy Laidlaw, Banghua Zhu, Stuart Russell, Anca Dragan
AISTATS 2023 Byzantine-Robust Federated Learning with Optimal Statistical Rates Banghua Zhu, Lun Wang, Qi Pang, Shuai Wang, Jiantao Jiao, Dawn Song, Michael I. Jordan
NeurIPS 2023 Doubly-Robust Self-Training Banghua Zhu, Mingyu Ding, Philip Jacobson, Ming Wu, Wei Zhan, Michael I. Jordan, Jiantao Jiao
ICML 2023 Jump-Start Reinforcement Learning Ikechukwu Uchendu, Ted Xiao, Yao Lu, Banghua Zhu, Mengyuan Yan, Joséphine Simon, Matthew Bennice, Chuyuan Fu, Cong Ma, Jiantao Jiao, Sergey Levine, Karol Hausman
NeurIPSW 2023 NexusRaven: A Commercially-Permissive Language Model for Function Calling Venkat Krishna Srinivasan, Zhen Dong, Banghua Zhu, Brian Yu, Damon Mosk-Aoyama, Kurt Keutzer, Jiantao Jiao, Jian Zhang
NeurIPSW 2023 NexusRaven: A Commercially-Permissive Language Model for Function Calling Venkat Krishna Srinivasan, Zhen Dong, Banghua Zhu, Brian Yu, Hanzi Mao, Damon Mosk-Aoyama, Kurt Keutzer, Jiantao Jiao, Jian Zhang
ICML 2023 Online Learning in Stackelberg Games with an Omniscient Follower Geng Zhao, Banghua Zhu, Jiantao Jiao, Michael Jordan
NeurIPSW 2023 Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment Tianhao Wu, Banghua Zhu, Ruoyu Zhang, Zhaojin Wen, Kannan Ramchandran, Jiantao Jiao
ICLRW 2023 Principled Reinforcement Learning with Human Feedback from Pairwise or $k$-Wise Comparisons Banghua Zhu, Jiantao Jiao, Michael Jordan
ICMLW 2023 Principled Reinforcement Learning with Human Feedback from Pairwise or $k$-Wise Comparisons Banghua Zhu, Michael Jordan, Jiantao Jiao
ICML 2023 Principled Reinforcement Learning with Human Feedback from Pairwise or K-Wise Comparisons Banghua Zhu, Michael Jordan, Jiantao Jiao
NeurIPS 2023 Towards Optimal Caching and Model Selection for Large Model Inference Banghua Zhu, Ying Sheng, Lianmin Zheng, Clark Barrett, Michael I. Jordan, Jiantao Jiao
NeurIPSW 2023 Towards Optimal Statistical Watermarking Baihe Huang, Banghua Zhu, Hanlin Zhu, Jason Lee, Jiantao Jiao, Michael Jordan
NeurIPSW 2023 Towards the Fundamental Limits of Knowledge Transfer over Finite Domains Qingyue Zhao, Banghua Zhu
NeurIPS 2021 Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism Paria Rashidinejad, Banghua Zhu, Cong Ma, Jiantao Jiao, Stuart J. Russell