Codeplay: Autotelic Learning Through Collaborative Self-Play in Programming Environments
Abstract
Autotelic learning is a training setup in which agents learn by setting their own goals and trying to achieve them. However, creatively generating freeform goals is challenging for autotelic agents. We present Codeplay, an algorithm that casts autotelic learning as a game between a Setter agent and a Solver agent: the Setter generates programming puzzles of appropriate difficulty and novelty for the Solver, and the Solver learns to solve them. Early experiments with the Setter demonstrate that one can effectively control the tradeoff between the difficulty of a puzzle and its novelty by tuning the reward of the Setter, a code language model finetuned with deep reinforcement learning.
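To make the difficulty/novelty tradeoff concrete, here is a minimal sketch of what a tunable Setter reward could look like. It is not the reward used in the paper, which the abstract does not specify. Every name here (`difficulty_score`, `novelty_score`, `setter_reward`, `novelty_weight`, `target`) is hypothetical, and so are the modeling choices: difficulty is scored by how close the Solver's empirical solve rate is to a target rate, and novelty by string dissimilarity to an archive of earlier puzzles.

```python
from difflib import SequenceMatcher


def difficulty_score(solve_rate: float, target: float = 0.5) -> float:
    """Score in [0, 1] that peaks when the Solver's solve rate equals `target`.

    Assumption: "appropriate difficulty" means neither trivial nor impossible,
    so puzzles solved about `target` of the time score highest.
    """
    if not 0.0 <= solve_rate <= 1.0:
        raise ValueError("solve_rate must lie in [0, 1]")
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must lie in [0, 1]")
    # Normalize by the largest possible distance from the target.
    max_distance = max(target, 1.0 - target)
    return 1.0 - abs(solve_rate - target) / max_distance


def novelty_score(puzzle: str, archive: list[str]) -> float:
    """Score in [0, 1]: one minus the similarity to the closest archived puzzle.

    Assumption: plain string similarity stands in for whatever novelty
    measure a real system would use (for example, embedding distance).
    """
    if not archive:
        return 1.0
    closest = max(SequenceMatcher(None, puzzle, old).ratio() for old in archive)
    return 1.0 - closest


def setter_reward(
    puzzle: str,
    solve_rate: float,
    archive: list[str],
    novelty_weight: float = 0.5,
) -> float:
    """Convex combination of the two scores; `novelty_weight` is the tuning knob."""
    if not 0.0 <= novelty_weight <= 1.0:
        raise ValueError("novelty_weight must lie in [0, 1]")
    return (
        (1.0 - novelty_weight) * difficulty_score(solve_rate)
        + novelty_weight * novelty_score(puzzle, archive)
    )
```

Under these assumptions, setting `novelty_weight` to 0 rewards only well-calibrated difficulty, while setting it to 1 rewards only puzzles unlike those already generated; intermediate values trade one off against the other.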
Cite
Text
Teodorescu et al. "Codeplay: Autotelic Learning Through Collaborative Self-Play in Programming Environments." NeurIPS 2023 Workshops: IMOL, 2023.
Markdown
[Teodorescu et al. "Codeplay: Autotelic Learning Through Collaborative Self-Play in Programming Environments." NeurIPS 2023 Workshops: IMOL, 2023.](https://mlanthology.org/neuripsw/2023/teodorescu2023neuripsw-codeplay/)
BibTeX
@inproceedings{teodorescu2023neuripsw-codeplay,
title = {{Codeplay: Autotelic Learning Through Collaborative Self-Play in Programming Environments}},
author = {Teodorescu, Laetitia and Colas, Cédric and Bowers, Matthew and Carta, Thomas and Oudeyer, Pierre-Yves},
booktitle = {NeurIPS 2023 Workshops: IMOL},
year = {2023},
url = {https://mlanthology.org/neuripsw/2023/teodorescu2023neuripsw-codeplay/}
}