Autotelic LLM-Based Exploration for Goal-Conditioned RL
Abstract
Autotelic agents, capable of autonomously generating and pursuing their own goals, represent a promising approach to open-ended learning and skill acquisition in reinforcement learning. Setting and pursuing one's own goals is especially difficult in open worlds, where it requires inventing new, previously unobserved goals. In this work, we propose an architecture in which a single generalist autotelic agent is trained on an automatic curriculum of goals. We leverage large language models (LLMs) to generate goals as code for reward functions, guided by learnability and difficulty estimates. The goal-conditioned RL agent is then trained on goals sampled according to learning progress. We compare our method to an adaptation of OMNI-EPIC to goal-conditioned RL. Our preliminary experiments indicate that our method generates a higher proportion of learnable goals, suggesting better adaptation to the goal-conditioned learner.
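To make the two core ideas in the abstract concrete, here is a minimal sketch of (a) goals represented as LLM-generated code for reward functions and (b) sampling goals in proportion to learning progress. All names (`Goal`, `reward_code`, `sample_goal`, the window-based progress estimate) are hypothetical illustrations, not the authors' actual implementation.

```python
import random

class Goal:
    """A goal represented as LLM-generated code defining a reward function."""

    def __init__(self, reward_code: str):
        # e.g. reward_code = "def reward(obs): return float(obs['door_open'])"
        self.reward_code = reward_code
        self.success_history: list[float] = []  # success rates over training windows

    def reward_fn(self):
        # Execute the generated code to obtain a callable reward function.
        namespace: dict = {}
        exec(self.reward_code, namespace)
        return namespace["reward"]

    def learning_progress(self, window: int = 10) -> float:
        # Absolute change in success rate between two recent windows:
        # high values indicate a goal the agent is currently learning.
        h = self.success_history
        if len(h) < 2 * window:
            return 1.0  # optimistic default so new goals get sampled
        recent = sum(h[-window:]) / window
        earlier = sum(h[-2 * window:-window]) / window
        return abs(recent - earlier)

def sample_goal(archive: list[Goal]) -> Goal:
    # Sample goals proportionally to estimated learning progress, focusing
    # training on learnable goals of intermediate difficulty.
    weights = [g.learning_progress() for g in archive]
    return random.choices(archive, weights=weights, k=1)[0]
```

In this sketch, an LLM would append new `Goal` objects to the archive (filtered by learnability and difficulty estimates), while `sample_goal` drives the automatic curriculum for the goal-conditioned learner.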
Cite
Text
Pourcel et al. "Autotelic LLM-Based Exploration for Goal-Conditioned RL." NeurIPS 2024 Workshops: IMOL, 2024.

Markdown

[Pourcel et al. "Autotelic LLM-Based Exploration for Goal-Conditioned RL." NeurIPS 2024 Workshops: IMOL, 2024.](https://mlanthology.org/neuripsw/2024/pourcel2024neuripsw-autotelic/)

BibTeX
@inproceedings{pourcel2024neuripsw-autotelic,
  title = {{Autotelic LLM-Based Exploration for Goal-Conditioned RL}},
  author = {Pourcel, Guillaume and Carta, Thomas and Kovač, Grgur and Oudeyer, Pierre-Yves},
  booktitle = {NeurIPS 2024 Workshops: IMOL},
  year = {2024},
  url = {https://mlanthology.org/neuripsw/2024/pourcel2024neuripsw-autotelic/}
}