GOPlan: Goal-Conditioned Offline Reinforcement Learning by Planning with Learned Models

Wang, Mianchu; Yang, Rui; Chen, Xi; Fang, Meng

NeurIPSW 2023

Abstract

Offline goal-conditioned RL (GCRL) offers a feasible paradigm for learning general-purpose policies from diverse, multi-task offline datasets. Despite notable recent progress, the predominant offline GCRL methods have been restricted to model-free approaches, which constrains their capacity to cope with limited data budgets and to generalize to unseen goals. In this work, we propose a novel two-stage model-based framework, Goal-conditioned Offline Planning (GOPlan), which consists of (1) pretraining a prior policy capable of capturing the multi-modal action distribution within the multi-goal dataset, and (2) employing the reanalysis method with planning to generate imagined trajectories for fine-tuning the policy. Specifically, the prior policy is based on an advantage-weighted Conditioned Generative Adversarial Network that exhibits distinct mode separation to overcome the pitfalls of out-of-distribution (OOD) actions. For further policy optimization, the reanalysis method generates high-quality imaginary data by planning with learned models for both intra-trajectory and inter-trajectory goals. Through experimental evaluations, we demonstrate that GOPlan achieves state-of-the-art performance on various offline multi-goal manipulation tasks. Moreover, our results highlight the superior ability of GOPlan to handle small data budgets and generalize to OOD goals.
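The abstract does not spell out how plans are scored or how imagined data is fed back into training, so the following is only a toy sketch of the general idea of planning with a learned model and a multi-modal prior, not the authors' implementation. Every name in it (`toy_model`, `toy_prior`, `plan`, `reanalyse`), the 1-D environment, and the negative-distance score are assumptions made for illustration.

```python
import random


def toy_model(state, action):
    """Hypothetical learned dynamics: a 1-D point that moves by `action`."""
    return state + action


def toy_prior(state, goal, rng):
    """Hypothetical multi-modal prior policy: step left or right, plus noise."""
    mode = rng.choice([-1.0, 1.0])
    return mode + rng.gauss(0.0, 0.1)


def imagined_score(state, goal, first_action, horizon, rng):
    """Roll the model forward and sum negative distances to the goal.

    The first step uses `first_action`; later steps (if any) sample the prior.
    """
    s = toy_model(state, first_action)
    total = -abs(goal - s)
    for _ in range(horizon - 1):
        s = toy_model(s, toy_prior(s, goal, rng))
        total += -abs(goal - s)
    return total


def plan(state, goal, num_candidates=32, horizon=1, seed=0):
    """Sample candidate actions from the prior and keep the best-scoring one."""
    rng = random.Random(seed)
    candidates = [toy_prior(state, goal, rng) for _ in range(num_candidates)]
    return max(candidates,
               key=lambda a: imagined_score(state, goal, a, horizon, rng))


def reanalyse(state, dataset_action, goal):
    """Keep the dataset action unless the planned action scores strictly higher."""
    planned = plan(state, goal)
    rng = random.Random(0)
    planned_score = imagined_score(state, goal, planned, 1, rng)
    dataset_score = imagined_score(state, goal, dataset_action, 1, rng)
    return planned if planned_score > dataset_score else dataset_action


if __name__ == "__main__":
    # The prior proposes both directions; planning selects the useful mode.
    print(plan(0.0, 3.0), plan(0.0, -3.0))
    # A poor dataset action is replaced by a planned one.
    print(reanalyse(0.0, -1.0, 3.0))
```

In this toy setting the prior proposes steps in both directions and the planner keeps the one that the model predicts will end closer to the goal. The `reanalyse` helper mirrors, very loosely, the idea of replacing dataset actions with planned ones when the imagined outcome is better.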

Cite

Text

Wang et al. "GOPlan: Goal-Conditioned Offline Reinforcement Learning by Planning with Learned Models." NeurIPS 2023 Workshops: GCRL, 2023.

Markdown

[Wang et al. "GOPlan: Goal-Conditioned Offline Reinforcement Learning by Planning with Learned Models." NeurIPS 2023 Workshops: GCRL, 2023.](https://mlanthology.org/neuripsw/2023/wang2023neuripsw-goplan/)

BibTeX

@inproceedings{wang2023neuripsw-goplan,
  title     = {{GOPlan: Goal-Conditioned Offline Reinforcement Learning by Planning with Learned Models}},
  author    = {Wang, Mianchu and Yang, Rui and Chen, Xi and Fang, Meng},
  booktitle = {NeurIPS 2023 Workshops: GCRL},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/wang2023neuripsw-goplan/}
}