Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions

Abstract

In the context of hierarchical reinforcement learning, the idea of hierarchies of abstract machines (HAMs) is to write a partial policy as a set of hierarchical finite state machines with unspecified choice states, and to use reinforcement learning to learn an optimal completion of this partial policy. Given a HAM with a potentially deep hierarchical structure, there often exist many internal transitions, where a machine calls another machine while the environment state remains unchanged. In this paper, we propose a new hierarchical reinforcement learning algorithm that discovers such internal transitions automatically and short-circuits them recursively in the computation of Q values. The resulting HAMQ-INT algorithm significantly outperforms the state of the art on the benchmark Taxi domain and on the much more complex RoboCup Keepaway domain.
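To make the short-circuiting idea concrete, here is a minimal, illustrative sketch (not the authors' implementation; the class, method names, and data structures are all assumptions): Q-learning over HAM choice points, where a cached map of discovered internal transitions lets action selection skip over chains of choice points reached without any change to the environment state.

```python
# Hypothetical sketch of internal-transition short-circuiting in a HAM learner.
# All names here are illustrative assumptions, not the paper's actual code.
import random
from collections import defaultdict

class HAMQINTSketch:
    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)  # Q[(choice_point, choice)]
        # (choice_point, choice) -> next choice point reached with the
        # environment state unchanged (a discovered internal transition)
        self.internal = {}
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, cp, actions):
        """Pick a choice at cp, recursively skipping cached internal transitions."""
        while True:
            opts = actions(cp)
            if random.random() < self.epsilon:
                a = random.choice(opts)
            else:
                a = max(opts, key=lambda o: self.q[(cp, o)])
            nxt = self.internal.get((cp, a))
            if nxt is None:
                return cp, a   # a "real" choice: the environment will advance
            cp = nxt           # environment state unchanged: short-circuit ahead

    def update(self, cp, a, reward, tau, next_cp, next_actions, env_changed):
        """SMDP-style backup between consecutive non-internal choice points."""
        if not env_changed:
            # Discovered an internal transition: cache it rather than learning
            # a separate Q value for the short-circuited choice point.
            self.internal[(cp, a)] = next_cp
            return
        best_next = max((self.q[(next_cp, b)] for b in next_actions), default=0.0)
        target = reward + (self.gamma ** tau) * best_next
        self.q[(cp, a)] += self.alpha * (target - self.q[(cp, a)])
```

For example, once `("cp0", "call_nav")` is cached as internal with target `"cp1"`, a greedy call to `choose("cp0", ...)` returns the choice actually made at `"cp1"`, so Q values are only learned at choice points where the environment really advances.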

Cite

Text

Bai and Russell. "Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/IJCAI.2017/196

Markdown

[Bai and Russell. "Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/bai2017ijcai-efficient/) doi:10.24963/IJCAI.2017/196

BibTeX

@inproceedings{bai2017ijcai-efficient,
  title     = {{Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions}},
  author    = {Bai, Aijun and Russell, Stuart},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {1418--1424},
  doi       = {10.24963/IJCAI.2017/196},
  url       = {https://mlanthology.org/ijcai/2017/bai2017ijcai-efficient/}
}