Zhao, Andrew

10 publications

NeurIPS 2025 Absolute Zero: Reinforced Self-Play Reasoning with Zero Data Andrew Zhao, Yiran Wu, Yang Yue, Tong Wu, Quentin Xu, Yang Yue, Matthieu Lin, Shenzhi Wang, Qingyun Wu, Zilong Zheng, Gao Huang
NeurIPS 2025 Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Shenzhi Wang, Le Yu, Chang Gao, Chujie Zheng, Shixuan Liu, Rui Lu, Kai Dang, Xiong-Hui Chen, Jianxin Yang, Zhenru Zhang, Yuqiong Liu, An Yang, Andrew Zhao, Yang Yue, Shiji Song, Bowen Yu, Gao Huang, Junyang Lin
AAAI 2025 DiveR-CT: Diversity-Enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints Andrew Zhao, Quentin Xu, Matthieu Lin, Shenzhi Wang, Yong-Jin Liu, Zilong Zheng, Gao Huang
NeurIPS 2025 Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Yang Yue, Zhiqi Chen, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang
ICMLW 2024 DiveR-CT: Diversity-Enhanced Red Teaming with Relaxing Constraints Andrew Zhao, Quentin Xu, Matthieu Lin, Shenzhi Wang, Yong-Jin Liu, Zilong Zheng, Gao Huang
AAAI 2024 ExpeL: LLM Agents Are Experiential Learners Andrew Zhao, Daniel Huang, Quentin Xu, Matthieu Lin, Yong-Jin Liu, Gao Huang
AAAI 2024 Exploring Temporal Feature Correlation for Efficient and Stable Video Semantic Segmentation Matthieu Lin, Jenny Sheng, Yubin Hu, Yangguang Li, Lu Qi, Andrew Zhao, Gao Huang, Yong-Jin Liu
CVPRW 2024 Exploring Text-to-Motion Generation with Human Preference Jenny Sheng, Matthieu Lin, Andrew Zhao, Kevin Pruvost, Yu-Hui Wen, Yangguang Li, Gao Huang, Yong-Jin Liu
NeurIPS 2022 A Mixture of Surprises for Unsupervised Reinforcement Learning Andrew Zhao, Matthieu Lin, Yangguang Li, Yong-Jin Liu, Gao Huang
NeurIPS 2022 Provable General Function Class Representation Learning in Multitask Bandits and MDP Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang