Zhang, Zizhuo

3 publications

ICLR 2026. Co-Rewarding: Stable Self-Supervised RL for Eliciting Reasoning in Large Language Models. Zizhuo Zhang, Jianing Zhu, Xinmu Ge, Zihua Zhao, Zhanke Zhou, Xuan Li, Xiao Feng, Jiangchao Yao, Bo Han
ICLR 2026. Towards Understanding Valuable Preference Data for Large Language Model Alignment. Zizhuo Zhang, Qizhou Wang, Shanshan Ye, Jianing Zhu, Jiangchao Yao, Bo Han, Masashi Sugiyama
ICLR 2025. Fast and Accurate Blind Flexible Docking. Zizhuo Zhang, Lijun Wu, Kaiyuan Gao, Jiangchao Yao, Tao Qin, Bo Han