The Successful Ingredients of Policy Gradient Algorithms
Abstract
Despite remarkable success in recent years, the mechanisms underlying the advances of reinforcement learning remain poorly understood. In this paper, we identify these mechanisms - which we call ingredients - in on-policy policy gradient methods and empirically determine their impact on learning. To allow an equitable assessment, we conduct our experiments based on a unified and modular implementation. Our results underline the significance of recent algorithmic advances and demonstrate that reaching state-of-the-art performance may not require sophisticated algorithms but can also be accomplished by combining a few simple ingredients.
Cite
Text
Gronauer et al. "The Successful Ingredients of Policy Gradient Algorithms." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/338
Markdown
[Gronauer et al. "The Successful Ingredients of Policy Gradient Algorithms." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/gronauer2021ijcai-successful/) doi:10.24963/IJCAI.2021/338
BibTeX
@inproceedings{gronauer2021ijcai-successful,
title = {{The Successful Ingredients of Policy Gradient Algorithms}},
author = {Gronauer, Sven and Gottwald, Martin and Diepold, Klaus},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2021},
pages = {2455--2461},
doi = {10.24963/IJCAI.2021/338},
url = {https://mlanthology.org/ijcai/2021/gronauer2021ijcai-successful/}
}