Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU
Abstract
We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers at this https URL.
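The "system of queues" the abstract mentions pairs naturally with a producer/consumer arrangement: agents push states onto a prediction queue, predictor threads batch them for a single GPU inference call, and trainer threads batch experiences for a single gradient update. The sketch below is a minimal illustration of that pattern, not the paper's actual implementation; `model.predict`, `model.train`, and `agents[i].wait_q` are hypothetical stand-ins.

```python
import queue

BATCH_SIZE = 32  # illustrative; the paper tunes batching dynamically

def predictor(model, agents, prediction_queue):
    """Collect pending prediction requests and serve them with one batched GPU call."""
    while True:
        agent_id, state = prediction_queue.get()   # block for the first request
        ids, states = [agent_id], [state]
        while len(states) < BATCH_SIZE:            # opportunistically grow the batch
            try:
                agent_id, state = prediction_queue.get_nowait()
            except queue.Empty:
                break
            ids.append(agent_id)
            states.append(state)
        policies, values = model.predict(states)   # hypothetical batched inference
        for i, p, v in zip(ids, policies, values):
            agents[i].wait_q.put((p, v))           # hand the result back to the agent

def trainer(model, training_queue):
    """Accumulate experiences and apply one batched gradient update."""
    while True:
        batch = [training_queue.get() for _ in range(BATCH_SIZE)]
        model.train(batch)                         # hypothetical batched update
```

Under this arrangement the GPU sees large batches for both inference and training; the dynamic scheduling strategy the abstract refers to would then adjust the balance of agents, predictors, and trainers at runtime to keep those batches full.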
Cite
Text
Babaeizadeh et al. "Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU." International Conference on Learning Representations, 2017.
Markdown
[Babaeizadeh et al. "Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/babaeizadeh2017iclr-reinforcement/)
BibTeX
@inproceedings{babaeizadeh2017iclr-reinforcement,
  title     = {{Reinforcement Learning Through Asynchronous Advantage Actor-Critic on a GPU}},
  author    = {Babaeizadeh, Mohammad and Frosio, Iuri and Tyree, Stephen and Clemons, Jason and Kautz, Jan},
  booktitle = {International Conference on Learning Representations},
  year      = {2017},
  url       = {https://mlanthology.org/iclr/2017/babaeizadeh2017iclr-reinforcement/}
}