Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons
Abstract
Delayed reinforcement learning is an attractive framework for the unsupervised learning of action policies for autonomous agents. Some existing delayed reinforcement learning techniques have shown promise in simple domains. However, a number of hurdles must be passed before they are applicable to realistic problems. This paper describes one such difficulty, the input generalization problem (whereby the system must generalize to produce similar actions in similar situations) and an implemented solution, the G algorithm. This algorithm is based on recursive splitting of the state space based on statistical measures of differences in reinforcements received. Connectionist backpropagation has previously been used for input generalization in reinforcement learning. We compare the two techniques analytically and empirically. The G algorithm's sound statistical basis makes it easy to predict when it should and should not work, whereas the behavior of backpropagation is unpredictable. We found that a previous successful use of backpropagation can be explained by the linearity of the application domain. We found that in another domain, G reliably found the optimal policy, whereas none of a set of runs of backpropagation with many combinations of parameters did. 1
Cite
Text
Chapman and Kaelbling. "Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons." International Joint Conference on Artificial Intelligence, 1991.Markdown
[Chapman and Kaelbling. "Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons." International Joint Conference on Artificial Intelligence, 1991.](https://mlanthology.org/ijcai/1991/chapman1991ijcai-input/)BibTeX
@inproceedings{chapman1991ijcai-input,
title = {{Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons}},
author = {Chapman, David and Kaelbling, Leslie Pack},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {1991},
pages = {726-731},
url = {https://mlanthology.org/ijcai/1991/chapman1991ijcai-input/}
}