Learning Dynamic Belief Graphs to Generalize on Text-Based Games
Abstract
Playing text-based games requires skills in processing natural language and sequential decision making. Achieving human-level performance on text-based games remains an open challenge, and prior research has largely relied on hand-crafted structured representations and heuristics. In this work, we investigate how an agent can plan and generalize in text-based games using graph-structured representations learned end-to-end from raw text. We propose a novel graph-aided transformer agent (GATA) that infers and updates latent belief graphs during planning to enable effective action selection by capturing the underlying game dynamics. GATA is trained using a combination of reinforcement and self-supervised learning. Our work demonstrates that the learned graph-based representations help agents converge to better policies than their text-only counterparts and facilitate effective generalization across game configurations. Experiments on 500+ unique games from the TextWorld suite show that our best agent outperforms text-based baselines by an average of 24.2%.
Cite
Text
Adhikari et al. "Learning Dynamic Belief Graphs to Generalize on Text-Based Games." Neural Information Processing Systems, 2020.
Markdown
[Adhikari et al. "Learning Dynamic Belief Graphs to Generalize on Text-Based Games." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/adhikari2020neurips-learning/)
BibTeX
@inproceedings{adhikari2020neurips-learning,
title = {{Learning Dynamic Belief Graphs to Generalize on Text-Based Games}},
author = {Adhikari, Ashutosh and Yuan, Xingdi and Côté, Marc-Alexandre and Zelinka, Mikuláš and Rondeau, Marc-Antoine and Laroche, Romain and Poupart, Pascal and Tang, Jian and Trischler, Adam and Hamilton, Will},
booktitle = {Neural Information Processing Systems},
year = {2020},
url = {https://mlanthology.org/neurips/2020/adhikari2020neurips-learning/}
}