Model-Based Episodic Memory Induces Dynamic Hybrid Controls
Abstract
Episodic control enables sample efficiency in reinforcement learning by recalling past experiences from an episodic memory. We propose a new model-based episodic memory of trajectories addressing current limitations of episodic control. Our memory estimates trajectory values, guiding the agent towards good policies. Built upon the memory, we construct a complementary learning model via a dynamic hybrid control unifying model-based, episodic and habitual learning into a single architecture. Experiments demonstrate that our model allows significantly faster and better learning than other strong reinforcement learning agents across a variety of environments including stochastic and non-Markovian settings.
Cite
Text
Le et al. "Model-Based Episodic Memory Induces Dynamic Hybrid Controls." Neural Information Processing Systems, 2021.
Markdown
[Le et al. "Model-Based Episodic Memory Induces Dynamic Hybrid Controls." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/le2021neurips-modelbased/)
BibTeX
@inproceedings{le2021neurips-modelbased,
title = {{Model-Based Episodic Memory Induces Dynamic Hybrid Controls}},
author = {Le, Hung and George, Thommen Karimpanal and Abdolshah, Majid and Tran, Truyen and Venkatesh, Svetha},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/le2021neurips-modelbased/}
}