Zhang, Chongjie
77 publications
ICLR
2026
OPRIDE: Efficient Offline Preference-Based Reinforcement Learning via In-Dataset Exploration
NeurIPS
2025
A Bayesian Fast-Slow Framework to Mitigate Interference in Non-Stationary Reinforcement Learning
ICLR
2025
Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
NeurIPS
2022
LAPO: Latent-Variable Advantage-Weighted Policy Optimization for Offline Reinforcement Learning
NeurIPSW
2022
Model and Method: Training-Time Attack for Cooperative Multi-Agent Reinforcement Learning