Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning
Abstract
Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world problems, learning this value function requires a function approximator, which maps state-action pairs to values via a concise, parameterized function. In practice, the success of function approximators depends on the ability of the human designer to select an appropriate representation for the value function. A recently developed approach called evolutionary function approximation uses evolutionary computation to automate the search for effective representations. While this approach can substantially improve the performance of TD methods, it requires many sample episodes to do so. We present an enhancement to evolutionary function approximation that makes it much more sample-efficient by exploiting the off-policy nature of certain TD methods. Empirical results in a server job scheduling domain demonstrate that the enhanced method can learn better policies than evolution or TD methods alone and can do so in many fewer episodes than standard evolutionary function approximation.
Cite

Text

Whiteson and Stone. "Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2006.

Markdown

[Whiteson and Stone. "Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2006.](https://mlanthology.org/aaai/2006/whiteson2006aaai-sample/)

BibTeX
@inproceedings{whiteson2006aaai-sample,
title = {{Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning}},
author = {Whiteson, Shimon and Stone, Peter},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2006},
pages = {518--523},
url = {https://mlanthology.org/aaai/2006/whiteson2006aaai-sample/}
}