Zhang, Xinnan

2 publications

ICLRW 2025. Reinforcement Learning in Inference Time: A Perspective from Successive Policy Iterations. Xinnan Zhang, Chenliang Li, Siliang Zeng, Jiaxiang Li, Zhongruo Wang, Songtao Lu, Alfredo Garcia, Mingyi Hong.
NeurIPSW 2024. LLM Alignment Through Successive Policy Re-Weighting (SPR). Xinnan Zhang, Siliang Zeng, Jiaxiang Li, Kaixiang Lin, Mingyi Hong.