Hoeffding and Bernstein Races for Selecting Policies in Evolutionary Direct Policy Search

Abstract

Uncertainty arises in reinforcement learning from various sources, and therefore it is necessary to consider statistics based on several roll-outs for evaluating behavioral policies. We add adaptive uncertainty handling based on Hoeffding and empirical Bernstein races to the CMA-ES, a variable metric evolution strategy proposed for direct policy search. The uncertainty handling individually adjusts the number of episodes considered for the evaluation of each policy. The performance estimation is kept just accurate enough for a sufficiently good ranking of candidate policies, which is in turn sufficient for the CMA-ES to find better solutions. This increases the learning speed as well as the robustness of the algorithm.
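The racing idea described in the abstract can be illustrated with a minimal sketch (not the authors' implementation; all names and parameter choices here are hypothetical): each surviving candidate policy receives one more roll-out per round, a confidence interval on its mean return is computed from a Hoeffding or empirical Bernstein bound, and candidates whose upper confidence bound falls below some rival's lower bound are discarded. Sampling stops once the intervals separate the candidates well enough, rather than after a fixed number of episodes.

```python
import math
import random

def hoeffding_radius(n, delta, value_range):
    # Hoeffding confidence radius for n i.i.d. samples in an interval of
    # width value_range; holds with probability >= 1 - delta.
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def bernstein_radius(n, delta, value_range, emp_var):
    # Empirical Bernstein radius; tighter than Hoeffding when the
    # empirical variance emp_var is small.
    return (math.sqrt(2.0 * emp_var * math.log(3.0 / delta) / n)
            + 3.0 * value_range * math.log(3.0 / delta) / n)

def race(policies, evaluate, n_max=200, delta=0.05, value_range=1.0):
    """Sample roll-outs for all surviving policies until the confidence
    intervals separate (one survivor remains) or n_max episodes are used."""
    samples = {p: [] for p in policies}
    alive = set(policies)
    for n in range(1, n_max + 1):
        for p in alive:
            samples[p].append(evaluate(p))  # one more roll-out per round
        if n < 2:
            continue  # need at least two samples for an empirical variance
        bounds = {}
        for p in alive:
            xs = samples[p]
            mean = sum(xs) / n
            var = sum((x - mean) ** 2 for x in xs) / (n - 1)
            r = bernstein_radius(n, delta, value_range, var)
            bounds[p] = (mean - r, mean + r)
        # discard policies whose upper bound lies below a rival's lower bound
        best_lower = max(lo for lo, _ in bounds.values())
        alive = {p for p in alive if bounds[p][1] >= best_lower}
        if len(alive) == 1:
            break
    # rank the survivors by empirical mean return, best first
    return sorted(alive, key=lambda p: -sum(samples[p]) / len(samples[p]))
```

In the paper the race does not pick a single winner but stops as soon as the mu best policies of the CMA-ES population are ranked reliably; the sketch above shows only the elimination mechanism common to both bounds.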

Cite

Text

Heidrich-Meisner and Igel. "Hoeffding and Bernstein Races for Selecting Policies in Evolutionary Direct Policy Search." International Conference on Machine Learning, 2009. doi:10.1145/1553374.1553426

Markdown

[Heidrich-Meisner and Igel. "Hoeffding and Bernstein Races for Selecting Policies in Evolutionary Direct Policy Search." International Conference on Machine Learning, 2009.](https://mlanthology.org/icml/2009/heidrichmeisner2009icml-hoeffding/) doi:10.1145/1553374.1553426

BibTeX

@inproceedings{heidrichmeisner2009icml-hoeffding,
  title     = {{Hoeffding and Bernstein Races for Selecting Policies in Evolutionary Direct Policy Search}},
  author    = {Heidrich-Meisner, Verena and Igel, Christian},
  booktitle = {International Conference on Machine Learning},
  year      = {2009},
  pages     = {401--408},
  doi       = {10.1145/1553374.1553426},
  url       = {https://mlanthology.org/icml/2009/heidrichmeisner2009icml-hoeffding/}
}