An Investigation into Value-Implicit Pre-Training for Task-Agnostic, Sample-Efficient Goal-Conditioned Reinforcement Learning
Abstract
One of the primary challenges in learning a diverse set of robotic manipulation skills from raw sensory observations is learning a universal reward function that can be used for unseen tasks. To address this challenge, a recent breakthrough called value-implicit pre-training (VIP) has been proposed. VIP provides a self-supervised pre-trained visual representation that can generate dense and smooth reward functions for unseen robotic tasks. In this paper, we explore the feasibility of VIP’s goal-conditioned reward specification with the aim of achieving task-agnostic, sample-efficient goal-conditioned reinforcement learning (RL). We evaluate online RL that uses VIP-generated rewards instead of human-crafted reward signals on goal-image-specified robotic manipulation tasks from Meta-World under a highly limited interaction budget. We find that combining three techniques accelerates online RL more than using VIP-generated rewards in isolation: (1) combining VIP-generated rewards with sparse task-completion rewards, (2) pre-training the policy on expert demonstration data via behavior cloning before RL training, and (3) oversampling the demonstration data during RL training.
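Two of the three techniques named in the abstract, reward combination and demonstration oversampling, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the dense reward is the one-step decrease in embedding distance to the goal image, and the names `combined_reward`, `sample_batch`, `sparse_bonus`, and `demo_fraction` are hypothetical. Embeddings are plain lists of floats standing in for the output of a pre-trained visual encoder, and behavior-cloning pre-training is omitted.

```python
import math
import random


def embedding_distance(a, b):
    """Euclidean distance between two embedding vectors (lists of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_reward(z_obs, z_next, z_goal, task_done, sparse_bonus=1.0):
    """Dense goal-progress reward plus a sparse task-completion bonus.

    Assumption: the dense term is the decrease in embedding distance to the
    goal after one step. sparse_bonus is a hypothetical weighting.
    """
    dense = embedding_distance(z_obs, z_goal) - embedding_distance(z_next, z_goal)
    return dense + (sparse_bonus if task_done else 0.0)


def sample_batch(agent_buffer, demo_buffer, batch_size, demo_fraction=0.25, rng=random):
    """Draw a training batch in which demonstration transitions are oversampled.

    Assumption: a fixed fraction of every batch comes from the demonstration
    buffer, regardless of how small that buffer is relative to the agent's.
    """
    n_demo = round(batch_size * demo_fraction) if demo_buffer else 0
    if not agent_buffer:
        # Before any online interaction, fall back to demonstrations only.
        n_demo = batch_size
    batch = [rng.choice(demo_buffer) for _ in range(n_demo)]
    batch += [rng.choice(agent_buffer) for _ in range(batch_size - n_demo)]
    return batch
```

With `demo_fraction=0.25` and a batch size of 8, two transitions per batch come from the demonstrations even when the agent's own buffer is far larger, which is the sense of "oversampling" assumed here.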
Cite
Text
Noh et al. "An Investigation into Value-Implicit Pre-Training for Task-Agnostic, Sample-Efficient Goal-Conditioned Reinforcement Learning." NeurIPS 2023 Workshops: GCRL, 2023.
Markdown
[Noh et al. "An Investigation into Value-Implicit Pre-Training for Task-Agnostic, Sample-Efficient Goal-Conditioned Reinforcement Learning." NeurIPS 2023 Workshops: GCRL, 2023.](https://mlanthology.org/neuripsw/2023/noh2023neuripsw-investigation/)
BibTeX
@inproceedings{noh2023neuripsw-investigation,
title = {{An Investigation into Value-Implicit Pre-Training for Task-Agnostic, Sample-Efficient Goal-Conditioned Reinforcement Learning}},
author = {Noh, Samyeul and Kim, Seonghyun and Jang, Ingook and Myung, Hyun},
booktitle = {NeurIPS 2023 Workshops: GCRL},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/noh2023neuripsw-investigation/}
}