Language Model-in-the-Loop: Data Optimal Approach to Recommend Actions in Text Games
Abstract
Large Language Models (LLMs) have demonstrated strong performance on language understanding benchmarks. A recent use case for LLMs involves training decision-making agents over textual information. The existing approach leverages an LLM's linguistic priors to recommend action candidates in text games, i.e., to operate without environment-provided actions. However, adapting LLMs to specific games or tasks requires a massive amount of annotated human gameplay. Moreover, the existing approach keeps the language model frozen during the agent's training, which limits learning from in-game knowledge about the world. Hence, we explore strategies to adapt the language model for candidate recommendation using in-game transitions in an online learning fashion, mitigating reliance on human-annotated gameplay, which is costly to acquire. In this paper, we propose in-game transition selection methods to adapt the LLM in the loop, reducing the dependency on human-annotated gameplay while improving performance and convergence. Our method demonstrates a 53% relative improvement in average game score over the previous state-of-the-art model and converges more than twice as fast in the fully annotated dataset setting. Furthermore, even with only 10% of the human annotations, we surpass the state-of-the-art performance obtained with 100% of the annotations.
Cite
Text
Sudhakar et al. "Language Model-in-the-Loop: Data Optimal Approach to Recommend Actions in Text Games." ICML 2024 Workshops: FM-Wild, 2024.
Markdown
[Sudhakar et al. "Language Model-in-the-Loop: Data Optimal Approach to Recommend Actions in Text Games." ICML 2024 Workshops: FM-Wild, 2024.](https://mlanthology.org/icmlw/2024/sudhakar2024icmlw-language/)
BibTeX
@inproceedings{sudhakar2024icmlw-language,
title = {{Language Model-in-the-Loop: Data Optimal Approach to Recommend Actions in Text Games}},
author = {Sudhakar, Arjun V and Parthasarathi, Prasanna and Rajendran, Janarthanan and Chandar, Sarath},
booktitle = {ICML 2024 Workshops: FM-Wild},
year = {2024},
url = {https://mlanthology.org/icmlw/2024/sudhakar2024icmlw-language/}
}