Model-Based Episodic Memory Induces Dynamic Hybrid Controls

Abstract

Episodic control enables sample efficiency in reinforcement learning by recalling past experiences from an episodic memory. We propose a new model-based episodic memory of trajectories that addresses current limitations of episodic control. Our memory estimates trajectory values, guiding the agent towards good policies. Building on this memory, we construct a complementary learning model via a dynamic hybrid control that unifies model-based, episodic, and habitual learning in a single architecture. Experiments demonstrate that our model allows significantly faster and better learning than other strong reinforcement learning agents across a variety of environments, including stochastic and non-Markovian settings.

Cite

Text

Le et al. "Model-Based Episodic Memory Induces Dynamic Hybrid Controls." Neural Information Processing Systems, 2021.

Markdown

[Le et al. "Model-Based Episodic Memory Induces Dynamic Hybrid Controls." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/le2021neurips-modelbased/)

BibTeX

@inproceedings{le2021neurips-modelbased,
  title     = {{Model-Based Episodic Memory Induces Dynamic Hybrid Controls}},
  author    = {Le, Hung and George, Thommen Karimpanal and Abdolshah, Majid and Tran, Truyen and Venkatesh, Svetha},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/le2021neurips-modelbased/}
}