Shan, Zifei

1 publications

ICLR 2025 B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners Weihao Zeng, Yuzhen Huang, Lulu Zhao, Yijun Wang, Zifei Shan, Junxian He