Language Model-in-the-Loop: Data Optimal Approach to Recommend Actions in Text Games

Sudhakar, Arjun V; Parthasarathi, Prasanna; Rajendran, Janarthanan; Chandar, Sarath

ICMLW 2024

Abstract

Large Language Models (LLMs) have demonstrated superior performance on language understanding benchmarks. A recent use case for LLMs involves training decision-making agents over textual information. The existing approach leverages LLMs' linguistic priors to recommend action candidates in text games, i.e., to operate without environment-provided actions. However, adapting LLMs to specific games or tasks requires a massive amount of annotated human gameplay. Moreover, in the existing approach, the language model is kept frozen during the agent's training, which limits learning from in-game knowledge about the world. Hence, we explore strategies to adapt the language model for candidate recommendation using in-game transitions in an online learning fashion, mitigating reliance on human-annotated gameplay, which is costly to acquire. In this paper, we propose in-game transition selection methods to adapt the LLM in the loop, reducing the dependency on human-annotated gameplay while improving performance and convergence. Our method demonstrates a 53% relative improvement in average game score over the previous state-of-the-art model and converges more than twice as fast in the fully annotated dataset setting. Furthermore, even with only 10% of the human annotations, our method surpasses the state-of-the-art model trained with 100% of the annotations.
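
The abstract does not say which criteria the proposed transition selection methods use. As a rough illustration of the general idea only, the sketch below filters a buffer of in-game transitions and turns the kept ones into (observation, action) pairs that could be used to fine-tune an action-recommending language model. The `Transition` type, the reward-threshold rule, and the helper names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transition:
    """One step of gameplay collected by the agent (hypothetical structure)."""
    observation: str  # text the game showed the agent
    action: str       # text command the agent issued
    reward: float     # score change returned by the game


def select_transitions(buffer: List[Transition],
                       min_reward: float = 0.0) -> List[Transition]:
    # Assumed selection rule: keep only transitions whose reward is
    # strictly above the threshold. The paper may use different criteria.
    return [t for t in buffer if t.reward > min_reward]


def to_training_pairs(transitions: List[Transition]) -> List[Tuple[str, str]]:
    # Each pair is (context, target) for fine-tuning the language model
    # to propose the action given the observation.
    return [(t.observation, t.action) for t in transitions]


# Toy buffer with made-up game text, for illustration only.
buffer = [
    Transition("You are in a kitchen.", "open fridge", 0.0),
    Transition("The fridge is open.", "take egg", 5.0),
    Transition("You hold an egg.", "jump", -1.0),
]

pairs = to_training_pairs(select_transitions(buffer))
```

In an online setting, a step like this would presumably run periodically during agent training, so that the language model is updated on the agent's own gameplay rather than relying only on human-annotated trajectories.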

Cite

Text

Sudhakar et al. "Language Model-in-the-Loop: Data Optimal Approach to Recommend Actions in Text Games." ICML 2024 Workshops: FM-Wild, 2024.

Markdown

[Sudhakar et al. "Language Model-in-the-Loop: Data Optimal Approach to Recommend Actions in Text Games." ICML 2024 Workshops: FM-Wild, 2024.](https://mlanthology.org/icmlw/2024/sudhakar2024icmlw-language/)

BibTeX

@inproceedings{sudhakar2024icmlw-language,
  title     = {{Language Model-in-the-Loop: Data Optimal Approach to Recommend Actions in Text Games}},
  author    = {Sudhakar, Arjun V and Parthasarathi, Prasanna and Rajendran, Janarthanan and Chandar, Sarath},
  booktitle = {ICML 2024 Workshops: FM-Wild},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/sudhakar2024icmlw-language/}
}