Huang, Haian

1 publication

NeurIPS 2025 Semi-Off-Policy Reinforcement Learning for Vision-Language Slow-Thinking Reasoning Junhao Shen, Haiteng Zhao, Yuzhe Gu, Songyang Gao, Kuikun Liu, Haian Huang, Jianfei Gao, Dahua Lin, Wenwei Zhang, Kai Chen