GOPlan: Goal-Conditioned Offline Reinforcement Learning by Planning with Learned Models
Abstract
Offline goal-conditioned RL (GCRL) offers a feasible paradigm for learning general-purpose policies from diverse, multi-task offline datasets. Despite notable recent progress, the predominant offline GCRL methods have been restricted to model-free approaches, which limits their capacity to cope with limited data budgets and to generalize to unseen goals. In this work, we propose a novel two-stage model-based framework, Goal-conditioned Offline Planning (GOPlan), comprising (1) pretraining a prior policy capable of capturing the multi-modal action distribution within the multi-goal dataset; and (2) employing the reanalysis method with planning to generate imagined trajectories for fine-tuning policies. Specifically, the prior policy is based on an advantage-weighted Conditioned Generative Adversarial Network that exhibits distinct mode separation to overcome the pitfalls of out-of-distribution (OOD) actions. For further policy optimization, the reanalysis method generates high-quality imaginary data by planning with learned models for both intra-trajectory and inter-trajectory goals. Through experimental evaluations, we demonstrate that GOPlan achieves state-of-the-art performance on various offline multi-goal manipulation tasks. Moreover, our results highlight the superior ability of GOPlan to handle small data budgets and generalize to OOD goals.
Cite
Text
Wang et al. "GOPlan: Goal-Conditioned Offline Reinforcement Learning by Planning with Learned Models." NeurIPS 2023 Workshops: GCRL, 2023.
Markdown
[Wang et al. "GOPlan: Goal-Conditioned Offline Reinforcement Learning by Planning with Learned Models." NeurIPS 2023 Workshops: GCRL, 2023.](https://mlanthology.org/neuripsw/2023/wang2023neuripsw-goplan/)
BibTeX
@inproceedings{wang2023neuripsw-goplan,
  title = {{GOPlan: Goal-Conditioned Offline Reinforcement Learning by Planning with Learned Models}},
  author = {Wang, Mianchu and Yang, Rui and Chen, Xi and Fang, Meng},
  booktitle = {NeurIPS 2023 Workshops: GCRL},
  year = {2023},
  url = {https://mlanthology.org/neuripsw/2023/wang2023neuripsw-goplan/}
}