Sample Efficient Actor-Critic with Experience Replay

Abstract

This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
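The truncation-with-bias-correction idea named in the abstract can be illustrated with a minimal sketch. The snippet below is a hypothetical helper (not the authors' code) for a single discrete-action state: the sampled-action term uses an importance weight clipped at a threshold c, and a correction term, taken in expectation under the current policy, reintroduces the probability mass removed by clipping. In the paper the sampled-action term uses a Retrace target; here a plain critic estimate Q stands in for brevity, and all names and shapes are assumptions.

import numpy as np

def acer_truncated_is_terms(pi, mu, q, a_t, c=10.0):
    """Sketch of truncated importance sampling with bias correction
    (in the spirit of ACER) for one discrete-action state.

    pi  : current-policy probabilities pi(.|x), shape (A,)
    mu  : behaviour-policy probabilities mu(.|x), shape (A,)
    q   : critic action-value estimates Q(x, .), shape (A,)
    a_t : index of the action actually taken under mu
    c   : truncation threshold for the importance weight
    """
    rho = pi / mu                            # importance weights rho(a) = pi(a|x) / mu(a|x)
    v = float(np.dot(pi, q))                 # baseline V(x) = E_{a~pi}[Q(x, a)]

    # Sampled-action term: importance weight truncated at c (bounds the variance).
    truncated_term = min(rho[a_t], c) * (q[a_t] - v)

    # Bias-correction term: active only where rho(a) > c, weighted under pi,
    # so the estimator stays unbiased despite the truncation above.
    correction_weights = np.clip(1.0 - c / rho, 0.0, None) * pi
    correction_term = float(np.dot(correction_weights, q - v))

    return truncated_term, correction_term


# Example usage with made-up numbers: 3 actions, behaviour policy differs from pi.
pi = np.array([0.7, 0.2, 0.1])
mu = np.array([0.1, 0.6, 0.3])
q = np.array([1.0, 0.2, -0.5])
print(acer_truncated_is_terms(pi, mu, q, a_t=0))

Multiplying each term by the corresponding grad log pi(a|x) and summing would give the (sketched) off-policy policy-gradient estimate; the stochastic dueling networks and trust region step described in the paper are separate components not shown here.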

Cite

Text

Wang et al. "Sample Efficient Actor-Critic with Experience Replay." International Conference on Learning Representations, 2017.

Markdown

[Wang et al. "Sample Efficient Actor-Critic with Experience Replay." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/wang2017iclr-sample/)

BibTeX

@inproceedings{wang2017iclr-sample,
  title     = {{Sample Efficient Actor-Critic with Experience Replay}},
  author    = {Wang, Ziyu and Bapst, Victor and Heess, Nicolas and Mnih, Volodymyr and Munos, Rémi and Kavukcuoglu, Koray and de Freitas, Nando},
  booktitle = {International Conference on Learning Representations},
  year      = {2017},
  url       = {https://mlanthology.org/iclr/2017/wang2017iclr-sample/}
}