Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State
Abstract
We present Utile Suffix Memory, a reinforcement learning algorithm that uses short-term memory to overcome the state aliasing that results from hidden state. By combining the advantages of previous work in instance-based (or “memory-based”) learning and previous work with statistical tests for separating noise from task structure, the method learns quickly, creates only as much memory as needed for the task at hand, and handles noise well. Utile Suffix Memory uses a tree-structured representation, and is related to work on Prediction Suffix Trees [Ron et al., 1994], Parti-game [Moore, 1993], G-algorithm [Chapman and Kaelbling, 1991], and Variable Resolution Dynamic Programming [Moore, 1991].
Cite
Text
McCallum. "Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State." International Conference on Machine Learning, 1995. doi:10.1016/B978-1-55860-377-6.50055-4
Markdown
[McCallum. "Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State." International Conference on Machine Learning, 1995.](https://mlanthology.org/icml/1995/mccallum1995icml-instance/) doi:10.1016/B978-1-55860-377-6.50055-4
BibTeX
@inproceedings{mccallum1995icml-instance,
title = {{Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State}},
author = {McCallum, R. Andrew},
booktitle = {International Conference on Machine Learning},
year = {1995},
pages = {387-395},
doi = {10.1016/B978-1-55860-377-6.50055-4},
url = {https://mlanthology.org/icml/1995/mccallum1995icml-instance/}
}