Shi, Chengshuai

18 publications

UAI 2025 Augmenting Online RL with Offline Data Is All You Need: A Unified Hybrid RL Algorithm Design and Analysis Ruiquan Huang, Donghao Li, Chengshuai Shi, Cong Shen, Jing Yang
ICLR 2025 Building Math Agents with Multi-Turn Iterative Preference Learning Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu
AISTATS 2025 Cost-Aware Optimal Pairwise Pure Exploration Di Wu, Chengshuai Shi, Ruida Zhou, Cong Shen
NeurIPS 2025 Greedy Sampling Is Provably Efficient for RLHF Di Wu, Chengshuai Shi, Jing Yang, Cong Shen
NeurIPS 2024 Efficient Prompt Optimization Through the Lens of Best Arm Identification Chengshuai Shi, Kun Yang, Zihan Chen, Jundong Li, Jing Yang, Cong Shen
TMLR 2024 Harnessing the Power of Federated Learning in Federated Contextual Bandits Chengshuai Shi, Ruida Zhou, Kun Yang, Cong Shen
NeurIPS 2024 Mixture of Demonstrations for In-Context Learning Song Wang, Zihan Chen, Chengshuai Shi, Cong Shen, Jundong Li
NeurIPS 2024 Transformers as Game Players: Provable In-Context Game-Playing Capabilities of Pre-Trained Models Chengshuai Shi, Kun Yang, Jing Yang, Cong Shen
NeurIPSW 2023 Harnessing the Power of Federated Learning in Federated Contextual Bandits Chengshuai Shi, Kun Yang, Ruida Zhou, Cong Shen
ICLR 2023 Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, Liwei Wang, Tong Zhang
ICML 2023 Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang
ICML 2022 A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, Tong Zhang
ICLRW 2022 A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games Wei Xiong, Han Zhong, Chengshuai Shi, Cong Shen, Tong Zhang
AISTATS 2021 Federated Multi-Armed Bandits with Personalization Chengshuai Shi, Cong Shen, Jing Yang
NeurIPS 2021 (Almost) Free Incentivized Exploration from Decentralized Learning Agents Chengshuai Shi, Haifeng Xu, Wei Xiong, Cong Shen
AAAI 2021 Federated Multi-Armed Bandits Chengshuai Shi, Cong Shen
NeurIPS 2021 Heterogeneous Multi-Player Multi-Armed Bandits: Closing the Gap and Generalization Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang
AISTATS 2020 Decentralized Multi-Player Multi-Armed Bandits with No Collision Information Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang