Chen, Hardy

1 publication

TMLR 2025. SFT or RL? An Early Investigation into Training R1-like Reasoning Large Vision-Language Models. Hardy Chen, Haoqin Tu, Fali Wang, Hui Liu, Xianfeng Tang, Xinya Du, Yuyin Zhou, Cihang Xie