Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU

Abstract

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers at this https URL.
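The abstract's "system of queues" refers to an architecture in which simulation agents do not run the network themselves: they submit states to a shared prediction queue and completed experiences to a shared training queue, while dedicated predictor and trainer threads batch those requests so the GPU sees large forward and backward passes. Below is a minimal, self-contained sketch of that producer/consumer pattern, not the authors' released code; all names, batch sizes, and the dummy model here are illustrative assumptions.

```python
import queue
import random
import threading
import time

PREDICTION_BATCH = 4   # illustrative batch sizes, not the paper's tuned values
TRAINING_BATCH = 4

prediction_queue = queue.Queue()   # (agent_id, state) requests from agents
training_queue = queue.Queue()     # (state, action, reward) experiences

def dummy_model_predict(states):
    """Stand-in for a batched GPU forward pass: one random action per state."""
    return [random.randrange(4) for _ in states]

def dummy_model_train(batch):
    """Stand-in for a batched GPU gradient step."""
    time.sleep(0.001)

def predictor(answer_queues):
    """Drain the prediction queue in batches and route actions back to agents."""
    while True:
        batch = [prediction_queue.get()]          # block for at least one request
        while len(batch) < PREDICTION_BATCH:
            try:
                batch.append(prediction_queue.get_nowait())
            except queue.Empty:
                break                             # run a partial batch rather than wait
        actions = dummy_model_predict([s for _, s in batch])
        for (agent_id, _), action in zip(batch, actions):
            answer_queues[agent_id].put(action)   # per-agent reply queue

def trainer():
    """Drain the training queue in fixed-size batches and apply updates."""
    while True:
        batch = [training_queue.get() for _ in range(TRAINING_BATCH)]
        dummy_model_train(batch)

def agent(agent_id, answer_queue, steps=100):
    """Simulate an environment: request actions, then enqueue experiences."""
    state = 0
    for _ in range(steps):
        prediction_queue.put((agent_id, state))
        action = answer_queue.get()               # block until the predictor replies
        reward, state = 1.0, state + 1            # toy environment transition
        training_queue.put((state, action, reward))

if __name__ == "__main__":
    answer_queues = [queue.Queue() for _ in range(8)]
    threading.Thread(target=predictor, args=(answer_queues,), daemon=True).start()
    threading.Thread(target=trainer, daemon=True).start()
    workers = [threading.Thread(target=agent, args=(i, q))
               for i, q in enumerate(answer_queues)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

The "dynamic scheduling strategy" mentioned in the abstract concerns tuning the counts of these components (agents, predictors, trainers) at runtime to keep the GPU saturated; the fixed thread counts above are a simplification.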

Cite

Text

Babaeizadeh et al. "Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU." International Conference on Learning Representations, 2017.

Markdown

[Babaeizadeh et al. "Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/babaeizadeh2017iclr-reinforcement/)

BibTeX

@inproceedings{babaeizadeh2017iclr-reinforcement,
  title     = {{Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU}},
  author    = {Babaeizadeh, Mohammad and Frosio, Iuri and Tyree, Stephen and Clemons, Jason and Kautz, Jan},
  booktitle = {International Conference on Learning Representations},
  year      = {2017},
  url       = {https://mlanthology.org/iclr/2017/babaeizadeh2017iclr-reinforcement/}
}