Sample Efficient Actor-Critic with Experience Replay
Abstract
This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
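The truncated importance sampling with bias correction mentioned in the abstract can be illustrated with a minimal NumPy sketch. All probabilities, values, and the truncation threshold below are made-up illustrative assumptions, not numbers from the paper; the sketch only shows the general idea of clipping the importance weight for the sampled action and correcting the resulting bias with an expectation under the current policy.

```python
import numpy as np

# Hypothetical toy setup: 4 discrete actions with illustrative probabilities.
pi = np.array([0.50, 0.25, 0.15, 0.10])   # current policy pi(a|x)
mu = np.array([0.25, 0.25, 0.25, 0.25])   # behaviour policy mu(a|x) that filled the replay buffer
q = np.array([1.2, 0.4, -0.3, 0.1])       # critic estimates Q(x, a)
v = float(pi @ q)                         # V(x) = E_{a~pi}[Q(x, a)]
c = 2.0                                   # truncation threshold on the importance weight (assumed)
a_t = 0                                   # action stored in the replay buffer
q_ret = 1.0                               # off-policy return target for (x, a_t), e.g. from Retrace

rho = pi / mu                             # importance weights rho(a) = pi(a|x) / mu(a|x)
rho_bar = min(c, rho[a_t])                # truncated weight for the sampled action, bounds variance

# First term: truncated importance-sampled weight for the replayed action.
sampled_term_weight = rho_bar * (q_ret - v)

# Second term: bias correction, an expectation under pi over all actions,
# active only where rho(a) exceeds the truncation threshold c.
correction = np.maximum(0.0, (rho - c) / rho)
bias_correction_weights = pi * correction * (q - v)

print(sampled_term_weight, bias_correction_weights)
```

These two weights would multiply the corresponding score-function terms of the policy gradient; the clipping keeps the variance of the replayed term bounded, while the correction term restores the mass removed by clipping without reintroducing unbounded importance ratios.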
Cite
Text
Wang et al. "Sample Efficient Actor-Critic with Experience Replay." International Conference on Learning Representations, 2017.
Markdown
[Wang et al. "Sample Efficient Actor-Critic with Experience Replay." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/wang2017iclr-sample/)
BibTeX
@inproceedings{wang2017iclr-sample,
title = {{Sample Efficient Actor-Critic with Experience Replay}},
author = {Wang, Ziyu and Bapst, Victor and Heess, Nicolas and Mnih, Volodymyr and Munos, Rémi and Kavukcuoglu, Koray and de Freitas, Nando},
booktitle = {International Conference on Learning Representations},
year = {2017},
url = {https://mlanthology.org/iclr/2017/wang2017iclr-sample/}
}