Compositional Instruction Following with Language Models and Reinforcement Learning
Abstract
Combining reinforcement learning with language grounding is challenging as the agent needs to explore the environment while simultaneously learning multiple language-conditioned tasks. To address this, we introduce a novel method: the compositionally-enabled reinforcement learning language agent (CERLLA). Our method reduces the sample complexity of tasks specified with language by leveraging compositional policy representations and a semantic parser trained using reinforcement learning and in-context learning. We evaluate our approach in an environment requiring function approximation and demonstrate compositional generalization to novel tasks. On 162 tasks designed to test compositional generalization, our method significantly outperforms the previous best non-compositional baseline in sample complexity: it attains a higher success rate in fewer environment steps, matching an oracle policy's upper-bound performance of 92%, whereas the baseline reaches only 80% in the same number of environment steps.
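The abstract does not spell out what the "compositional policy representations" look like. A minimal sketch of one common approach in this line of work, Boolean composition of learned task value functions, where conjunction and disjunction of tasks map to element-wise min and max over Q-values, is shown below. All names (compose_and, compose_or, q_red, q_box) are illustrative assumptions, not the paper's actual API or method.

```python
# Illustrative sketch only: Boolean composition of task value functions,
# one plausible reading of "compositional policy representations".
# All function and variable names here are hypothetical, not from the paper.
import numpy as np


def compose_and(q_a: np.ndarray, q_b: np.ndarray) -> np.ndarray:
    """Conjunction of two tasks: value is high only where both tasks score well."""
    return np.minimum(q_a, q_b)


def compose_or(q_a: np.ndarray, q_b: np.ndarray) -> np.ndarray:
    """Disjunction of two tasks: value is high where either task scores well."""
    return np.maximum(q_a, q_b)


def greedy_action(q_values: np.ndarray) -> int:
    """Pick the action with the highest composed value in the current state."""
    return int(np.argmax(q_values))


if __name__ == "__main__":
    # Hypothetical Q-values over 4 actions for two base tasks in some state,
    # e.g. "reach the red object" and "reach the box".
    q_red = np.array([0.1, 0.9, 0.3, 0.2])
    q_box = np.array([0.4, 0.2, 0.8, 0.1])

    # A parsed instruction like "red AND box" would select the conjunction,
    # so the agent acts against the composed values without further training.
    q_red_and_box = compose_and(q_red, q_box)
    print(greedy_action(q_red_and_box))
```

Under this reading, a semantic parser maps an instruction to a Boolean expression over known base tasks, and the composed value function yields a zero-shot policy for the novel combination; how CERLLA actually trains and composes its policies is detailed in the paper itself.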
Cite
Text
Cohen et al. "Compositional Instruction Following with Language Models and Reinforcement Learning." Transactions on Machine Learning Research, 2024.
Markdown
[Cohen et al. "Compositional Instruction Following with Language Models and Reinforcement Learning." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/cohen2024tmlr-compositional/)
BibTeX
@article{cohen2024tmlr-compositional,
  title = {{Compositional Instruction Following with Language Models and Reinforcement Learning}},
  author = {Cohen, Vanya and Tasse, Geraud Nangue and Gopalan, Nakul and James, Steven and Gombolay, Matthew and Mooney, Ray and Rosman, Benjamin},
  journal = {Transactions on Machine Learning Research},
  year = {2024},
  url = {https://mlanthology.org/tmlr/2024/cohen2024tmlr-compositional/}
}