Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
Abstract
Can we pre-train a generalist agent from a large amount of unlabeled offline trajectory data such that it can be immediately adapted to any new downstream task in a zero-shot manner? In this work, we present a functional reward encoding (FRE) as a general, scalable solution to this zero-shot RL problem. Our main idea is to learn functional representations of arbitrary tasks by encoding their state-reward samples using a transformer-based variational auto-encoder. This functional encoding not only enables the pre-training of an agent from a wide diversity of general unsupervised reward functions, but also provides a way to solve any new downstream task in a zero-shot manner, given a small number of reward-annotated samples. We empirically show that FRE agents trained on diverse random unsupervised reward functions can generalize to solve novel tasks in a range of simulated robotic benchmarks, often outperforming previous zero-shot RL and offline RL methods.
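To make the mechanism described above concrete, the following is a minimal sketch of a transformer-based variational encoder over state-reward samples, in the spirit of the FRE idea in the abstract. It is written in PyTorch with illustrative module names, layer sizes, and a mean-pooled latent; all of these choices are assumptions for exposition, not the authors' implementation.

```python
# Sketch of a functional reward encoding: a small transformer VAE that maps a
# set of (state, reward) samples to a latent task embedding z, and decodes z
# plus a query state back into a predicted reward. Hyperparameters and names
# are illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn


class FunctionalRewardEncoder(nn.Module):
    def __init__(self, state_dim: int, latent_dim: int = 32, d_model: int = 64):
        super().__init__()
        # Embed each (state, reward) pair as one token for the transformer.
        self.token_embed = nn.Linear(state_dim + 1, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Variational head: mean and log-variance of the task latent z.
        self.to_mean = nn.Linear(d_model, latent_dim)
        self.to_logvar = nn.Linear(d_model, latent_dim)
        # Decoder: predict the reward of a query state given z.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + state_dim, d_model),
            nn.ReLU(),
            nn.Linear(d_model, 1),
        )

    def encode(self, states, rewards):
        # states: (batch, n_samples, state_dim); rewards: (batch, n_samples, 1)
        tokens = self.token_embed(torch.cat([states, rewards], dim=-1))
        pooled = self.encoder(tokens).mean(dim=1)  # permutation-invariant pooling
        return self.to_mean(pooled), self.to_logvar(pooled)

    def forward(self, states, rewards, query_states):
        mean, logvar = self.encode(states, rewards)
        # Reparameterization trick: sample z from the inferred posterior.
        z = mean + torch.randn_like(mean) * torch.exp(0.5 * logvar)
        z_rep = z.unsqueeze(1).expand(-1, query_states.shape[1], -1)
        pred_reward = self.decoder(torch.cat([z_rep, query_states], dim=-1))
        kl = -0.5 * torch.mean(1 + logvar - mean.pow(2) - logvar.exp())
        return pred_reward, kl


if __name__ == "__main__":
    enc = FunctionalRewardEncoder(state_dim=4)
    s = torch.randn(8, 16, 4)    # 8 tasks, 16 reward-annotated samples each
    r = torch.randn(8, 16, 1)
    q = torch.randn(8, 32, 4)    # query states for reward reconstruction
    pred, kl = enc(s, r, q)
    print(pred.shape, kl.item())  # torch.Size([8, 32, 1])
```

In this sketch, the latent z would condition a downstream policy during pre-training across many unsupervised reward functions; at test time, a handful of reward-annotated samples from a new task can be encoded into z with no further gradient updates, matching the zero-shot setting described in the abstract.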
Cite
Text
Frans et al. "Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings." International Conference on Machine Learning, 2024.
Markdown
[Frans et al. "Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/frans2024icml-unsupervised/)
BibTeX
@inproceedings{frans2024icml-unsupervised,
  title = {{Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings}},
  author = {Frans, Kevin and Park, Seohong and Abbeel, Pieter and Levine, Sergey},
  booktitle = {International Conference on Machine Learning},
  year = {2024},
  pages = {13927--13942},
  volume = {235},
  url = {https://mlanthology.org/icml/2024/frans2024icml-unsupervised/}
}