Action Elimination and Stopping Conditions for Reinforcement Learning
Abstract
We consider incorporating action-elimination procedures into reinforcement learning algorithms. We suggest a framework based on learning upper and lower estimates of the value function or the Q-function and eliminating actions that are not optimal. We provide model-based and model-free variants of the elimination method. We further derive stopping conditions that guarantee that the learned policy is approximately optimal with high probability. Simulations demonstrate a considerable speedup and added robustness.
ICML: Proceedings of the Twentieth International Conference on Machine Learning
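The elimination idea described in the abstract can be sketched in a few lines: maintain upper and lower confidence estimates of the Q-function, discard any action whose upper estimate falls below the best lower estimate, and stop once the remaining interval is tight. The sketch below is a minimal illustration under that reading; all names (`q_upper`, `q_lower`, `eliminate_actions`, `should_stop`) are hypothetical and do not reproduce the paper's notation or its concentration bounds.

```python
def eliminate_actions(q_upper, q_lower, valid_actions, state):
    """Keep only actions whose upper Q-estimate reaches the best lower estimate.

    An action a can be discarded once q_upper[state][a] falls below
    max over a' of q_lower[state][a'], since a is then suboptimal
    up to the confidence level of the estimates.
    """
    best_lower = max(q_lower[state][a] for a in valid_actions[state])
    valid_actions[state] = {
        a for a in valid_actions[state]
        if q_upper[state][a] >= best_lower
    }
    return valid_actions[state]


def should_stop(q_upper, q_lower, state, epsilon):
    """Illustrative stopping test: the gap between the best upper and best
    lower estimate bounds how far the greedy policy can be from optimal."""
    return max(q_upper[state].values()) - max(q_lower[state].values()) <= epsilon


# Toy example with one state and three actions.
q_upper = {0: {0: 1.0, 1: 0.4, 2: 0.9}}
q_lower = {0: {0: 0.6, 1: 0.1, 2: 0.5}}
valid = {0: {0, 1, 2}}

remaining = eliminate_actions(q_upper, q_lower, valid, state=0)
# Action 1 is eliminated: its upper bound 0.4 is below the best lower bound 0.6.
```

In an actual algorithm the estimates would be updated from samples (model-based or model-free, as the paper distinguishes) and the confidence intervals would shrink over time, so the eliminated set only grows.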
Cite
Text
Even-Dar et al. "Action Elimination and Stopping Conditions for Reinforcement Learning." International Conference on Machine Learning, 2003.
Markdown
[Even-Dar et al. "Action Elimination and Stopping Conditions for Reinforcement Learning." International Conference on Machine Learning, 2003.](https://mlanthology.org/icml/2003/evendar2003icml-action/)
BibTeX
@inproceedings{evendar2003icml-action,
title = {{Action Elimination and Stopping Conditions for Reinforcement Learning}},
author = {Even-Dar, Eyal and Mannor, Shie and Mansour, Yishay},
booktitle = {International Conference on Machine Learning},
year = {2003},
pages = {162-169},
url = {https://mlanthology.org/icml/2003/evendar2003icml-action/}
}