Action Elimination and Stopping Conditions for Reinforcement Learning

Abstract

We consider incorporating action elimination procedures in reinforcement learning algorithms. We suggest a framework that is based on learning upper and lower estimates of the value function or the Q-function and eliminating actions that are not optimal. We provide model-based and model-free variants of the elimination method. We further derive stopping conditions that guarantee that the learned policy is approximately optimal with high probability. Simulations demonstrate a considerable speedup and added robustness.

Published in ICML: Proceedings of the Twentieth International Conference on Machine Learning.
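The core idea in the abstract, eliminating an action once its upper value estimate falls below the best lower estimate in the same state, can be sketched as follows. This is a minimal illustration of the elimination criterion with hypothetical confidence bounds, not the paper's exact algorithm or bound construction:

```python
import numpy as np

def eliminate_actions(q_lower, q_upper):
    """Mark actions that are provably suboptimal given confidence bounds.

    q_lower, q_upper: arrays of shape (n_states, n_actions), assumed to
    satisfy q_lower <= Q* <= q_upper with high probability.
    Returns a boolean mask where True means the action is still viable.
    An action is eliminated when its upper bound is below the best
    lower bound in that state, so it cannot be optimal.
    """
    best_lower = q_lower.max(axis=1, keepdims=True)
    return q_upper >= best_lower

# Toy example (hypothetical bounds): one state, three actions.
q_lo = np.array([[0.5, 0.1, 0.4]])
q_hi = np.array([[0.9, 0.4, 0.8]])
viable = eliminate_actions(q_lo, q_hi)
# Action 1's upper bound (0.4) is below the best lower bound (0.5),
# so action 1 is eliminated; actions 0 and 2 remain viable.
```

A stopping condition in the same spirit would halt learning once, in every state, the gap between the bounds of the remaining actions is below a tolerance, at which point the greedy policy with respect to the lower bound is approximately optimal.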

Cite

Text

Even-Dar et al. "Action Elimination and Stopping Conditions for Reinforcement Learning." International Conference on Machine Learning, 2003.

Markdown

[Even-Dar et al. "Action Elimination and Stopping Conditions for Reinforcement Learning." International Conference on Machine Learning, 2003.](https://mlanthology.org/icml/2003/evendar2003icml-action/)

BibTeX

@inproceedings{evendar2003icml-action,
  title     = {{Action Elimination and Stopping Conditions for Reinforcement Learning}},
  author    = {Even-Dar, Eyal and Mannor, Shie and Mansour, Yishay},
  booktitle = {International Conference on Machine Learning},
  year      = {2003},
  pages     = {162--169},
  url       = {https://mlanthology.org/icml/2003/evendar2003icml-action/}
}