Multi-Step Greedy Reinforcement Learning Algorithms
Abstract
Multi-step greedy policies have been extensively used in model-based reinforcement learning (RL), both when a model of the environment is available (e.g., in the game of Go) and when it is learned. In this paper, we explore their benefits in model-free RL, when employed via multi-step dynamic programming algorithms: $\kappa$-Policy Iteration ($\kappa$-PI) and $\kappa$-Value Iteration ($\kappa$-VI). These methods iteratively compute the next policy ($\kappa$-PI) and value function ($\kappa$-VI) by solving a surrogate decision problem with a shaped reward and a smaller discount factor. We derive model-free RL algorithms based on $\kappa$-PI and $\kappa$-VI in which the surrogate problem can be solved by any discrete or continuous action RL method, such as DQN and TRPO. We identify the importance of a hyper-parameter that controls the extent to which the surrogate problem is solved and suggest a way to set this parameter. When evaluated on a range of Atari and MuJoCo benchmark tasks, our results indicate that for the right range of $\kappa$, our algorithms outperform DQN and TRPO. This shows that our multi-step greedy algorithms are general enough to be applied on top of any existing RL algorithm and can significantly improve its performance.
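To make the surrogate-problem idea concrete, below is a minimal tabular sketch of a $\kappa$-VI-style loop: given a value estimate $V$, each outer step solves a surrogate MDP whose reward is shaped by $V$ (roughly $r + (1-\kappa)\gamma\,\mathbb{E}[V(s')]$) and whose discount is $\kappa\gamma$. This is an illustrative reconstruction, not the paper's code; the function name, the tabular inputs `P` and `R`, and the fixed iteration counts are assumptions.

```python
import numpy as np

def kappa_value_iteration(P, R, gamma, kappa, n_outer=50, n_inner=200):
    """Illustrative tabular sketch of a kappa-VI-style loop (not the paper's code).

    P: (S, A, S) transition probabilities, R: (S, A) expected rewards.
    Each outer step solves a surrogate MDP with shaped reward
    R + (1 - kappa) * gamma * E[V(s')] and discount kappa * gamma.
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(n_outer):
        # Surrogate reward shaped by the current value estimate V.
        R_kappa = R + (1.0 - kappa) * gamma * (P @ V)        # shape (S, A)
        # Solve the surrogate MDP (smaller discount kappa * gamma) by inner VI.
        W = np.zeros(S)
        for _ in range(n_inner):
            Q = R_kappa + kappa * gamma * (P @ W)            # shape (S, A)
            W = Q.max(axis=1)
        V = W                                                # outer update: V <- value of surrogate problem
    # One-step greedy policy w.r.t. the final value estimate.
    policy = (R + gamma * (P @ V)).argmax(axis=1)
    return V, policy
```

Two sanity checks on the sketch: with `kappa=0` the inner problem has discount 0, so each outer step reduces to a standard Bellman optimality update (ordinary value iteration); with `kappa=1` the surrogate is the original MDP, so a single outer step (with enough inner iterations) already solves it. In the paper's model-free setting, this inner solve is replaced by an off-the-shelf RL method such as DQN or TRPO, run for a controlled budget on the shaped, rediscounted problem.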
Cite
Text
Tomar et al. "Multi-Step Greedy Reinforcement Learning Algorithms." International Conference on Machine Learning, 2020.
Markdown
[Tomar et al. "Multi-Step Greedy Reinforcement Learning Algorithms." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/tomar2020icml-multistep/)
BibTeX
@inproceedings{tomar2020icml-multistep,
title = {{Multi-Step Greedy Reinforcement Learning Algorithms}},
author = {Tomar, Manan and Efroni, Yonathan and Ghavamzadeh, Mohammad},
booktitle = {International Conference on Machine Learning},
year = {2020},
pages = {9504--9513},
volume = {119},
url = {https://mlanthology.org/icml/2020/tomar2020icml-multistep/}
}