Zhan, Simon Sinong

7 publications

ICLR 2026 Belief-Based Offline Reinforcement Learning for Delay-Robust Policy Optimization Simon Sinong Zhan, Qingyuan Wu, Philip Wang, Frank Yang, Xiangyu Shi, Chao Huang, Qi Zhu
ICLR 2026 Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning Yimeng Zhang, Tian Wang, Jiri Gesi, Ziyi Wang, Yuxuan Lu, Jiacheng Lin, Simon Sinong Zhan, Vianne R. Gao, Ruochen Jiao, Junze Liu, Kun Qian, Yuxin Tang, Ran Xue, Houyu Zhang, Qingjun Cui, Yufan Guo, Dakuo Wang
ICML 2025 Directly Forecasting Belief for Reinforcement Learning with Delays Qingyuan Wu, Yuhui Wang, Simon Sinong Zhan, Yixuan Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Jürgen Schmidhuber, Chao Huang
ICML 2024 Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays Qingyuan Wu, Simon Sinong Zhan, Yixuan Wang, Yuhui Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Jürgen Schmidhuber, Chao Huang
ICLRW 2024 Empowering Autonomous Driving with Large Language Models: A Safety Perspective Yixuan Wang, Ruochen Jiao, Simon Sinong Zhan, Chengtian Lang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu
NeurIPS 2024 Variational Delayed Policy Optimization Qingyuan Wu, Simon Sinong Zhan, Yixuan Wang, Yuhui Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Chao Huang
ICML 2023 Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments Yixuan Wang, Simon Sinong Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu