ML Anthology
Ma, Xiaoteng
20 publications
ICLR
2025
Cross-Domain Offline Policy Adaptation with Optimal Transport and Dataset Constraint
Jiafei Lyu, Mengbei Yan, Zhongjian Qiao, Runze Liu, Xiaoteng Ma, Deheng Ye, Jing-Wen Yang, Zongqing Lu, Xiu Li
JAIR
2025
DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning
Xiaoteng Ma, Junyao Chen, Li Xia, Jun Yang, Qianchuan Zhao, Zhengyuan Zhou
ICLR
2025
Episodic Novelty Through Temporal Distance
Yuhua Jiang, Qihan Liu, Yiqin Yang, Xiaoteng Ma, Dianyu Zhong, Hao Hu, Jun Yang, Bin Liang, Bo Xu, Chongjie Zhang, Qianchuan Zhao
ICLR
2024
Efficient Multi-Agent Reinforcement Learning by Planning
Qihan Liu, Jianing Ye, Xiaoteng Ma, Jun Yang, Bin Liang, Chongjie Zhang
NeurIPSW
2024
Episodic Novelty Through Temporal Distance
Yuhua Jiang, Qihan Liu, Yiqin Yang, Xiaoteng Ma, Dianyu Zhong, Bo Xu, Jun Yang, Bin Liang, Chongjie Zhang, Qianchuan Zhao
AAAI
2024
Learning Diverse Risk Preferences in Population-Based Self-Play
Yuhua Jiang, Qihan Liu, Xiaoteng Ma, Chenghao Li, Yiqin Yang, Jun Yang, Bin Liang, Qianchuan Zhao
NeurIPS
2024
NeuralPlane: An Efficiently Parallelizable Platform for Fixed-Wing Aircraft Control with Reinforcement Learning
Chuanyi Xue, Qihan Liu, Xiaoteng Ma, Yang Qi, Xinyao Qin, Yuhua Jiang, Ning Gui, Jinsheng Ren, Bin Liang, Jun Yang
ICLR
2024
SEABO: A Simple Search-Based Method for Offline Imitation Learning
Jiafei Lyu, Xiaoteng Ma, Le Wan, Runze Liu, Xiu Li, Zongqing Lu
ICML
2024
Single-Trajectory Distributionally Robust Reinforcement Learning
Zhipeng Liang, Xiaoteng Ma, Jose Blanchet, Jun Yang, Jiheng Zhang, Zhengyuan Zhou
NeurIPS
2023
Cross-Domain Policy Adaptation via Value-Guided Data Filtering
Kang Xu, Chenjia Bai, Xiaoteng Ma, Dong Wang, Bin Zhao, Zhen Wang, Xuelong Li, Wei Li
IJCAI
2023
Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning (Extended Abstract)
Xiaoteng Ma, Shuai Ma, Li Xia, Qianchuan Zhao
ICML
2023
What Is Essential for Unseen Goal Generalization of Offline Goal-Conditioned RL?
Rui Yang, Lin Yong, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang
AAAI
2022
Efficient Continuous Control with Double Actors and Regularized Critics
Jiafei Lyu, Xiaoteng Ma, Jiangpeng Yan, Xiu Li
NeurIPS
2022
Exploit Reward Shifting in Value-Based Deep-RL: Optimistic Curiosity-Based Exploration and Conservative Exploitation via Linear Reward Shaping
Hao Sun, Lei Han, Rui Yang, Xiaoteng Ma, Jian Guo, Bolei Zhou
JAIR
2022
Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning
Xiaoteng Ma, Shuai Ma, Li Xia, Qianchuan Zhao
NeurIPS
2022
Mildly Conservative Q-Learning for Offline Reinforcement Learning
Jiafei Lyu, Xiaoteng Ma, Xiu Li, Zongqing Lu
ICLR
2022
Offline Reinforcement Learning with Value-Based Episodic Memory
Xiaoteng Ma, Yiqin Yang, Hao Hu, Jun Yang, Chongjie Zhang, Qianchuan Zhao, Bin Liang, Qihan Liu
NeurIPS
2022
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing
Rui Yang, Chenjia Bai, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han
IJCAI
2021
Average-Reward Reinforcement Learning with Trust Region Methods
Xiaoteng Ma, Xiaohang Tang, Li Xia, Jun Yang, Qianchuan Zhao
NeurIPS
2021
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning
Yiqin Yang, Xiaoteng Ma, Chenghao Li, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang, Qianchuan Zhao