State Abstraction for Programmable Reinforcement Learning Agents
Abstract
Safe state abstraction in reinforcement learning allows an agent to ignore aspects of its current state that are irrelevant to its current decision, and therefore speeds up dynamic programming and learning. This paper explores safe state abstraction in hierarchical reinforcement learning, where learned behaviors must conform to a given partial, hierarchical program. Unlike previous approaches to this problem, our methods yield significant state abstraction while maintaining hierarchical optimality, i.e., optimality among all policies consistent with the partial program. We show how to achieve this for a partial programming language that is essentially Lisp augmented with nondeterministic constructs. We demonstrate our methods on two variants of Dietterich's taxi domain, showing how state abstraction and hierarchical optimality result in faster learning of better policies and enable the transfer of learned skills from one problem to another.
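To make the core idea concrete, here is a minimal sketch, not taken from the paper, of safe state abstraction on a taxi-style grid. The assumption illustrated is the one the abstract describes: while executing a Navigate subtask, the passenger's location and destination are irrelevant to the current decision, so a Q-table keyed on the abstract state (taxi position only) suffices and is much smaller than one over the full state. All names (`abstract_navigate`, `q_learn_navigate`, the 5x5 grid) are hypothetical illustrations, not the paper's ALisp machinery.

```python
import random

GRID = 5                                        # hypothetical 5x5 taxi-style grid
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]    # four movement actions

def abstract_navigate(state, target):
    """Safe abstraction for a Navigate(target) subtask: the passenger's
    location and destination do not affect navigation decisions, so the
    abstract state keeps only the taxi's position."""
    taxi, passenger, destination = state
    return taxi

def clip(p):
    return (min(GRID - 1, max(0, p[0])), min(GRID - 1, max(0, p[1])))

def q_learn_navigate(target, episodes=400, alpha=0.5, gamma=0.95, eps=0.2):
    """Tabular Q-learning over the *abstract* state space for one target.
    Passenger and destination are randomized per episode to show that the
    learned values do not depend on them."""
    Q = {}
    rng = random.Random(0)
    for _ in range(episodes):
        taxi = (rng.randrange(GRID), rng.randrange(GRID))
        passenger, dest = rng.randrange(4), rng.randrange(4)  # irrelevant vars
        for _ in range(50):
            s = abstract_navigate((taxi, passenger, dest), target)
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q.get((s, i), 0.0))
            taxi = clip((taxi[0] + ACTIONS[a][0], taxi[1] + ACTIONS[a][1]))
            r = 10.0 if taxi == target else -1.0
            best_next = max(Q.get((taxi, i), 0.0) for i in range(len(ACTIONS)))
            old = Q.get((s, a), 0.0)
            Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
            if taxi == target:
                break
    return Q

def greedy_rollout(Q, start, target, max_steps=25):
    """Follow the greedy policy from the learned abstract Q-table."""
    taxi = start
    for _ in range(max_steps):
        if taxi == target:
            return True
        a = max(range(len(ACTIONS)), key=lambda i: Q.get((taxi, i), 0.0))
        taxi = clip((taxi[0] + ACTIONS[a][0], taxi[1] + ACTIONS[a][1]))
    return taxi == target
```

Because the Q-table is keyed on the abstract state, it holds at most 25 positions rather than 25 positions times every passenger/destination combination; the same learned navigation values are reused regardless of the irrelevant variables, which is the speedup the abstract refers to.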
Cite
Text
Andre and Russell. "State Abstraction for Programmable Reinforcement Learning Agents." AAAI Conference on Artificial Intelligence, 2002. doi:10.5555/777092.777114
Markdown
[Andre and Russell. "State Abstraction for Programmable Reinforcement Learning Agents." AAAI Conference on Artificial Intelligence, 2002.](https://mlanthology.org/aaai/2002/andre2002aaai-state/) doi:10.5555/777092.777114
BibTeX
@inproceedings{andre2002aaai-state,
title = {{State Abstraction for Programmable Reinforcement Learning Agents}},
author = {Andre, David and Russell, Stuart J.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2002},
pages = {119-125},
doi = {10.5555/777092.777114},
url = {https://mlanthology.org/aaai/2002/andre2002aaai-state/}
}