Q-Cut - Dynamic Discovery of Sub-Goals in Reinforcement Learning

Abstract

We present the Q-Cut algorithm, a graph-theoretic approach for automatic detection of sub-goals in a dynamic environment, which is used to accelerate the Q-Learning algorithm. The learning agent creates an on-line map of the process history, and uses an efficient Max-Flow/Min-Cut algorithm for identifying bottlenecks. The policies for reaching bottlenecks are separately learned and added to the model in the form of options (macro-actions). We then extend the basic Q-Cut algorithm to the Segmented Q-Cut algorithm, which uses previously identified bottlenecks for state space partitioning, necessary for finding additional bottlenecks in complex environments. Experiments show significant performance improvements, particularly in the initial learning phase.
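The cut-detection step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper derives arc capacities from the agent's observed transition history, whereas here a hypothetical two-room graph with unit capacities stands in for the learned map. A standard Edmonds-Karp max-flow routine then recovers the doorway transition as the bottleneck (the min cut separating the source region from the target region).

```python
from collections import defaultdict, deque

def min_cut(edges, source, sink):
    """Edmonds-Karp max-flow; returns the edges crossing the source-side min cut."""
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += 0  # make the reverse residual arc visible to BFS
    flow = defaultdict(lambda: defaultdict(int))

    def bfs_augmenting_path():
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return None

    while True:
        parent = bfs_augmenting_path()
        if parent is None:
            break
        # Walk back from the sink to collect the path, then push flow along it.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] - flow[u][v] for u, v in path)
        for u, v in path:
            flow[u][v] += aug
            flow[v][u] -= aug

    # States still reachable from the source in the residual graph form one
    # side of the cut; the saturated edges crossing it are the bottleneck.
    reachable, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        for v in cap[u]:
            if v not in reachable and cap[u][v] - flow[u][v] > 0:
                reachable.add(v)
                queue.append(v)
    return [(u, v) for u in reachable for v in cap[u]
            if v not in reachable and cap[u][v] > 0]

# Toy two-room world: room {a1, a2} connects to room {b1, b2} only
# through the 'door' state; each observed transition gets unit capacity.
edges = [('a1', 'a2', 1), ('a2', 'a1', 1),
         ('a1', 'door', 1), ('a2', 'door', 1),
         ('door', 'b1', 1), ('b1', 'b2', 1), ('b2', 'b1', 1)]
print(min_cut(edges, 'a1', 'b2'))  # → [('door', 'b1')]
```

In the paper's setting, the cut states found this way become sub-goals, and options for reaching them are learned separately.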

Cite

Text

Menache et al. "Q-Cut - Dynamic Discovery of Sub-Goals in Reinforcement Learning." European Conference on Machine Learning, 2002. doi:10.1007/3-540-36755-1_25

Markdown

[Menache et al. "Q-Cut - Dynamic Discovery of Sub-Goals in Reinforcement Learning." European Conference on Machine Learning, 2002.](https://mlanthology.org/ecmlpkdd/2002/menache2002ecml-qcut/) doi:10.1007/3-540-36755-1_25

BibTeX

@inproceedings{menache2002ecml-qcut,
  title     = {{Q-Cut - Dynamic Discovery of Sub-Goals in Reinforcement Learning}},
  author    = {Menache, Ishai and Mannor, Shie and Shimkin, Nahum},
  booktitle = {European Conference on Machine Learning},
  year      = {2002},
  pages     = {295--306},
  doi       = {10.1007/3-540-36755-1_25},
  url       = {https://mlanthology.org/ecmlpkdd/2002/menache2002ecml-qcut/}
}