Qiu, Shuang

22 publications

AAAI 2025 Forward KL Regularized Preference Optimization for Aligning Diffusion Policies Zhao Shan, Chenyou Fan, Shuang Qiu, Jiyuan Shi, Chenjia Bai
ICLR 2025 Online Preference Alignment for Language Models via Count-Based Exploration Chenjia Bai, Yang Zhang, Shuang Qiu, Qiaosheng Zhang, Kang Xu, Xuelong Li
ICML 2025 ROPO: Robust Preference Optimization for Large Language Models Xize Liang, Chao Chen, Shuang Qiu, Jie Wang, Yue Wu, Zhihang Fu, Hanzhu Chen, Feng Wu, Jieping Ye
NeurIPS 2025 Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective Yang Zhang, Xinran Li, Jianing Ye, Shuang Qiu, Delin Qu, Xiu Li, Chongjie Zhang, Chenjia Bai
NeurIPS 2025 Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models Yiran Guo, Lijie Xu, Jie Liu, Ye Dan, Shuang Qiu
ICLR 2025 Tackling Data Corruption in Offline Reinforcement Learning via Sequence Modeling Jiawei Xu, Rui Yang, Shuang Qiu, Feng Luo, Meng Fang, Baoxiang Wang, Lei Han
JMLR 2024 Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach Shuang Qiu, Boxiang Lyu, Qinglin Meng, Zhaoran Wang, Zhuoran Yang, Michael I. Jordan
ICML 2024 Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning Dake Zhang, Boxiang Lyu, Shuang Qiu, Mladen Kolar, Tong Zhang
ICML 2024 Rewards-in-Context: Multi-Objective Alignment of Foundation Models with Dynamic Preference Adjustment Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen
AAAI 2023 Gradient-Variation Bound for Online Convex Optimization with Constraints Shuang Qiu, Xiaohan Wei, Mladen Kolar
ICLR 2023 Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics Sirui Zheng, Lingxiao Wang, Shuang Qiu, Zuyue Fu, Zhuoran Yang, Csaba Szepesvari, Zhaoran Wang
NeurIPS 2023 Posterior Sampling for Competitive RL: Function Approximation and Partial Observation Shuang Qiu, Ziyu Dai, Han Zhong, Zhaoran Wang, Zhuoran Yang, Tong Zhang
ICML 2022 Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang
ICML 2021 On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game Shuang Qiu, Jieping Ye, Zhaoran Wang, Zhuoran Yang
ICML 2021 Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang
CVPR 2021 Stylized Neural Painting Zhengxia Zou, Tianyang Shi, Shuang Qiu, Yi Yuan, Zhenwei Shi
ICML 2020 Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis Shuang Qiu, Xiaohan Wei, Zhuoran Yang
NeurIPS 2020 Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss Shuang Qiu, Xiaohan Wei, Zhuoran Yang, Jieping Ye, Zhaoran Wang
ECCVW 2020 VisDrone-CC2020: The Vision Meets Drone Crowd Counting Challenge Results Dawei Du, Longyin Wen, Pengfei Zhu, Heng Fan, Qinghua Hu, Haibin Ling, Mubarak Shah, Junwen Pan, Ali Al-Ali, Amr Mohamed, Bakour Imene, Bin Dong, Binyu Zhang, Bouchali Hadia Nesma, Chenfeng Xu, Chenzhen Duan, Ciro Castiello, Corrado Mencar, Dingkang Liang, Florian Krüger, Gennaro Vessio, Giovanna Castellano, Jieru Wang, Junyu Gao, Khalid Abualsaud, Laihui Ding, Lei Zhao, Marco Cianciotta, Muhammad Saqib, Noor Almaadeed, Omar Elharrouss, Pei Lyu, Qi Wang, Shidong Liu, Shuang Qiu, Siyang Pan, Somaya Al-Máadeed, Sultan Daud Khan, Tamer Khattab, Tao Han, Thomas Golda, Wei Xu, Xiang Bai, Xiaoqing Xu, Xuelong Li, Yanyun Zhao, Ye Tian, Yingnan Lin, Yongchao Xu, Yuehan Yao, Zhenyu Xu, Zhijian Zhao, Zhipeng Luo, Zhiwei Wei, Zhiyuan Zhao
NeurIPSW 2019 Robust One-Bit Recovery via ReLU Generative Networks: Improved Statistical Rate and Global Landscape Analysis Shuang Qiu, Xiaohan Wei, Zhuoran Yang
AAAI 2019 Which Factorization Machine Modeling Is Better: A Theoretical Answer with Optimal Guarantee Ming Lin, Shuang Qiu, Jieping Ye, Xiaomin Song, Qi Qian, Liang Sun, Shenghuo Zhu, Rong Jin
AAAI 2014 Recommendation by Mining Multiple User Behaviors with Group Sparsity Ting Yuan, Jian Cheng, Xi Zhang, Shuang Qiu, Hanqing Lu