The Successful Ingredients of Policy Gradient Algorithms

Abstract

Despite the remarkable success of reinforcement learning in recent years, the underlying mechanisms powering its advances remain poorly understood. In this paper, we identify these mechanisms - which we call ingredients - in on-policy policy gradient methods and empirically determine their impact on learning. To allow an equitable assessment, we conduct our experiments on a unified and modular implementation. Our results underline the significance of recent algorithmic advances and demonstrate that reaching state-of-the-art performance does not require sophisticated algorithms; it can also be accomplished by combining a few simple ingredients.
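For background, the on-policy policy gradient methods the paper studies share a common update rule given by the policy gradient theorem (standard textbook form, not a result specific to this paper):

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{\pi_\theta}\!\left[
      \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, \hat{A}_t
    \right]
```

Here \(\pi_\theta\) is the parameterized policy and \(\hat{A}_t\) is an estimate of the advantage at time \(t\); the "ingredients" examined in the paper (e.g. advantage estimation and implementation-level choices) vary how this gradient is estimated in practice.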

Cite

Text

Gronauer et al. "The Successful Ingredients of Policy Gradient Algorithms." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/338

Markdown

[Gronauer et al. "The Successful Ingredients of Policy Gradient Algorithms." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/gronauer2021ijcai-successful/) doi:10.24963/IJCAI.2021/338

BibTeX

@inproceedings{gronauer2021ijcai-successful,
  title     = {{The Successful Ingredients of Policy Gradient Algorithms}},
  author    = {Gronauer, Sven and Gottwald, Martin and Diepold, Klaus},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {2455--2461},
  doi       = {10.24963/IJCAI.2021/338},
  url       = {https://mlanthology.org/ijcai/2021/gronauer2021ijcai-successful/}
}