Zhang, Chongjie
74 publications
NeurIPS
2025
A Bayesian Fast-Slow Framework to Mitigate Interference in Non-Stationary Reinforcement Learning
ICLR
2025
Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
NeurIPS
2022
LAPO: Latent-Variable Advantage-Weighted Policy Optimization for Offline Reinforcement Learning
NeurIPSW
2022
Model and Method: Training-Time Attack for Cooperative Multi-Agent Reinforcement Learning