Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison
Abstract
Reinforcement learning (RL) methods have become popular in recent years because of their ability to solve complex tasks with minimal feedback. Both genetic algorithms (GAs) and temporal difference (TD) methods have proven effective at solving difficult RL problems, but few rigorous comparisons have been conducted; thus, no general guidelines describing the methods' relative strengths and weaknesses are available. This paper summarizes a detailed empirical comparison between a GA and a TD method in Keepaway, a standard RL benchmark domain based on robot soccer. The results from this study help isolate the factors critical to the performance of each learning method and yield insights into their general strengths and weaknesses.
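TD methods of the kind compared in the paper learn by bootstrapping: each value estimate is nudged toward the reward plus the discounted estimate of the successor state. The following is a minimal illustrative sketch of a tabular TD(0) update on a hypothetical two-state toy chain, not the paper's actual Keepaway learner:

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) update: move V[s] toward the bootstrapped target r + gamma * V[s_next]."""
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    return V[s]

# Hypothetical toy chain: state 0 transitions to terminal state 1 with reward 1.
# The terminal state's value stays 0, so V[0] should converge toward 1.0.
V = {0: 0.0, 1: 0.0}
for _ in range(100):
    td0_update(V, 0, 1.0, 1)
```

Policy search methods such as the GA in this comparison instead evaluate and perturb whole policies using only episode-level returns, never forming per-step value estimates like the one above.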
Cite
Text
Taylor et al. "Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison." AAAI Conference on Artificial Intelligence, 2007.

Markdown
[Taylor et al. "Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison." AAAI Conference on Artificial Intelligence, 2007.](https://mlanthology.org/aaai/2007/taylor2007aaai-temporal/)

BibTeX
@inproceedings{taylor2007aaai-temporal,
  title     = {{Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison}},
  author    = {Taylor, Matthew E. and Whiteson, Shimon and Stone, Peter},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2007},
  pages     = {1675--1678},
  url       = {https://mlanthology.org/aaai/2007/taylor2007aaai-temporal/}
}