Generalization of Reinforcement Learners with Working and Episodic Memory
Abstract
Memory is an important aspect of intelligence and plays a role in many deep reinforcement learning models. However, little progress has been made in understanding when specific memory systems help more than others and how well they generalize. The field also has yet to adopt a consistent and rigorous approach for evaluating agent performance on holdout data. In this paper, we aim to develop a comprehensive methodology for testing different kinds of memory in an agent and for assessing how well the agent can apply what it learns in training to a holdout set that differs from the training set along dimensions we suggest are relevant for evaluating memory-specific generalization. To that end, we first construct a diverse set of memory tasks that allow us to evaluate test-time generalization across multiple dimensions. Second, we develop an agent architecture that combines multiple memory systems, perform multiple ablations on it, compare it against baseline models, and investigate its performance on the task suite.
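To make the combination of memory systems concrete, here is a minimal sketch of an agent with both a working memory and an episodic memory. It assumes the working memory is a simple recurrent hidden state and the episodic memory is a slot-based key-value store queried by cosine similarity; these are standard choices for illustration, not necessarily the paper's exact architecture, and the names `EpisodicMemory` and `MemoryAgent` are hypothetical.

```python
import numpy as np

class EpisodicMemory:
    """Slot-based key-value store; oldest slots are overwritten when full."""

    def __init__(self, capacity: int, dim: int):
        self.keys = np.zeros((capacity, dim))
        self.values = np.zeros((capacity, dim))
        self.count = 0

    def write(self, key: np.ndarray, value: np.ndarray) -> None:
        slot = self.count % len(self.keys)  # ring-buffer overwrite policy
        self.keys[slot], self.values[slot] = key, value
        self.count += 1

    def read(self, query: np.ndarray, k: int = 3) -> np.ndarray:
        n = min(self.count, len(self.keys))
        if n == 0:
            return np.zeros_like(query)
        keys = self.keys[:n]
        # Cosine similarity between the query and every stored key.
        sims = keys @ query / (np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-8)
        top = np.argsort(sims)[-min(k, n):]
        return self.values[top].mean(axis=0)  # average of the k most similar values

class MemoryAgent:
    """Working memory = recurrent hidden state; episodic memory = external store."""

    def __init__(self, obs_dim: int, hidden_dim: int, capacity: int = 1024, seed: int = 0):
        rng = np.random.default_rng(seed)
        in_dim = obs_dim + 2 * hidden_dim  # observation + retrieval + previous state
        self.W = rng.normal(0.0, 1.0 / np.sqrt(in_dim), (hidden_dim, in_dim))
        self.h = np.zeros(hidden_dim)
        self.memory = EpisodicMemory(capacity, hidden_dim)

    def step(self, obs: np.ndarray) -> np.ndarray:
        retrieved = self.memory.read(self.h)          # query episodic memory with current state
        x = np.concatenate([obs, retrieved, self.h])
        self.h = np.tanh(self.W @ x)                  # update working memory
        self.memory.write(self.h.copy(), self.h.copy())  # store the new state for later recall
        return self.h

agent = MemoryAgent(obs_dim=8, hidden_dim=16)
for t in range(100):
    state = agent.step(np.random.randn(8))
```

The key design point the sketch illustrates is the division of labor: the recurrent state carries information over short horizons, while the episodic store lets the agent recall states from arbitrarily far back in the episode, which is what holdout tasks with longer delays or larger environments would stress.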
Cite
Text

Fortunato et al. "Generalization of Reinforcement Learners with Working and Episodic Memory." Neural Information Processing Systems, 2019.

Markdown

[Fortunato et al. "Generalization of Reinforcement Learners with Working and Episodic Memory." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/fortunato2019neurips-generalization/)

BibTeX
@inproceedings{fortunato2019neurips-generalization,
  title = {{Generalization of Reinforcement Learners with Working and Episodic Memory}},
  author = {Fortunato, Meire and Tan, Melissa and Faulkner, Ryan and Hansen, Steven and Badia, Adrià Puigdomènech and Buttimore, Gavin and Deck, Charles and Leibo, Joel Z. and Blundell, Charles},
  booktitle = {Neural Information Processing Systems},
  year = {2019},
  pages = {12469--12478},
  url = {https://mlanthology.org/neurips/2019/fortunato2019neurips-generalization/}
}