Yan, Kaizhuo

2 publications

ICLR 2026. OPPO: Accelerating PPO-Based RLHF via Pipeline Overlap. Kaizhuo Yan, YingJie Yu, Yifan Yu, Haizhong Zheng, Fan Lai
ICLR 2026. VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use. Mingyuan Wu, Jingcheng Yang, Jize Jiang, Meitang Li, Kaizhuo Yan, Hanchao Yu, Minjia Zhang, ChengXiang Zhai, Klara Nahrstedt