Dynamic Abstraction in Reinforcement Learning via Clustering

Abstract

We consider a graph-theoretic approach for the automatic construction of options in a dynamic environment. A map of the environment is generated on-line by the learning agent, representing the topological structure of the state transitions. A clustering algorithm is then used to partition the state space into different regions. Policies for reaching the different parts of the space are learned separately and added to the model in the form of options (macro-actions). The options are used to accelerate the Q-Learning algorithm. We extend the basic algorithm and consider building a map that includes preliminary indications of the location of "interesting" regions of the state space, where the value gradient is significant and additional exploration might be beneficial. Experiments indicate significant speedups, especially in the initial learning phase.
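The pipeline the abstract describes can be sketched on a toy domain. The snippet below is a hypothetical reconstruction, not the authors' code: it builds a state-transition graph for a two-room gridworld, partitions it by spectral bisection (one possible choice of clustering algorithm; the specific method, the room layout, and all variable names are assumptions), and takes the states on the cluster boundary as option subgoals.

```python
import numpy as np

# Two 3x3 rooms joined by a single doorway state at (1, 3)
# (a hypothetical layout chosen only for illustration).
cells = [(r, c) for r in range(3) for c in range(3)]       # room A
cells += [(r, c) for r in range(3) for c in range(4, 7)]   # room B
cells.append((1, 3))                                       # doorway
index = {cell: i for i, cell in enumerate(cells)}
n = len(cells)

# Symmetric adjacency matrix from the 4-neighbour transitions
# an exploring agent would observe.
A = np.zeros((n, n))
for (r, c) in cells:
    for (dr, dc) in ((0, 1), (1, 0)):
        nb = (r + dr, c + dc)
        if nb in index:
            i, j = index[(r, c)], index[nb]
            A[i, j] = A[j, i] = 1.0

# Spectral bisection: split states by the sign of the Fiedler vector
# (eigenvector of the graph Laplacian's second-smallest eigenvalue).
L = np.diag(A.sum(axis=1)) - A
_, vecs = np.linalg.eigh(L)
labels = (vecs[:, 1] >= 0).astype(int)

# Candidate option subgoals: states with a neighbour in the other
# cluster -- here, the doorway area between the two rooms.
subgoals = [cells[i] for i in range(n)
            if any(A[i, j] and labels[i] != labels[j] for j in range(n))]
```

An option policy for "reach cluster 1 from cluster 0" could then be learned with the subgoal states as terminal rewards, and added to the agent's action set alongside the primitive moves.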

Cite

Text

Mannor et al. "Dynamic Abstraction in Reinforcement Learning via Clustering." International Conference on Machine Learning, 2004. doi:10.1145/1015330.1015355

Markdown

[Mannor et al. "Dynamic Abstraction in Reinforcement Learning via Clustering." International Conference on Machine Learning, 2004.](https://mlanthology.org/icml/2004/mannor2004icml-dynamic/) doi:10.1145/1015330.1015355

BibTeX

@inproceedings{mannor2004icml-dynamic,
  title     = {{Dynamic Abstraction in Reinforcement Learning via Clustering}},
  author    = {Mannor, Shie and Menache, Ishai and Hoze, Amit and Klein, Uri},
  booktitle = {International Conference on Machine Learning},
  year      = {2004},
  doi       = {10.1145/1015330.1015355},
  url       = {https://mlanthology.org/icml/2004/mannor2004icml-dynamic/}
}