Indefinite-Horizon POMDPs with Action-Based Termination
Abstract
For decision-theoretic planning problems with an indefinite horizon, plan execution terminates after a finite number of steps with probability one, but the number of steps until termination (i.e., the horizon) is uncertain and unbounded. In the traditional approach to modeling such problems, called a stochastic shortest-path problem, plan execution terminates when a particular state is reached, typically a goal state. We consider a model in which plan execution terminates when a stopping action is taken. We show that an action-based model of termination has several advantages for partially observable planning problems. It does not require a goal state to be fully observable; it does not require achievement of a goal state to be guaranteed; and it allows a proper policy to be found more easily. This framework allows many partially observable planning problems to be modeled in a more realistic way that does not require an artificial discount factor.
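As an illustrative sketch (not the paper's algorithm), the following toy two-state "tiger"-style POMDP shows the modeling idea from the abstract: execution ends when the agent takes an explicit stopping action rather than when a goal state is reached, and because the policy stops with certainty once its belief is confident enough, the horizon is finite with probability one and no artificial discount factor is needed. All names, costs, and probabilities below are illustrative assumptions.

```python
import random

# Illustrative assumptions (not from the paper):
P_CORRECT = 0.85          # accuracy of the noisy 'listen' observation
LISTEN_COST = 1.0         # cost per listen step
RIGHT_STOP_REWARD = 10.0  # reward for stopping with the correct guess
WRONG_STOP_PENALTY = -100.0

def belief_update(b_left, obs):
    """Bayes update of P(tiger is behind the left door) given obs 'left'/'right'."""
    like_left = P_CORRECT if obs == "left" else 1 - P_CORRECT
    like_right = 1 - P_CORRECT if obs == "left" else P_CORRECT
    num = like_left * b_left
    return num / (num + like_right * (1 - b_left))

def run_episode(rng, threshold=0.95):
    """Listen until the belief is confident, then take the stopping action.

    The stopping action terminates the episode regardless of the hidden
    state, so termination need not be observable or guaranteed to coincide
    with a goal state -- the agent simply chooses when to stop."""
    tiger_left = rng.random() < 0.5
    b = 0.5          # initial belief that the tiger is on the left
    total = 0.0
    steps = 0
    while 1 - threshold < b < threshold:
        total -= LISTEN_COST
        steps += 1
        truth = "left" if tiger_left else "right"
        correct = rng.random() < P_CORRECT
        obs = truth if correct else ("right" if truth == "left" else "left")
        b = belief_update(b, obs)
    # Stopping action: act on the confident belief and terminate.
    guess_left = b >= threshold
    total += RIGHT_STOP_REWARD if guess_left == tiger_left else WRONG_STOP_PENALTY
    return total, steps

rng = random.Random(0)
results = [run_episode(rng) for _ in range(1000)]
print(sum(r for r, _ in results) / len(results))  # undiscounted expected reward
```

Every simulated episode terminates after finitely many listens, so the undiscounted total reward is well defined, which is the property the action-based termination model is meant to capture.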
Cite
Text
Hansen. "Indefinite-Horizon POMDPs with Action-Based Termination." AAAI Conference on Artificial Intelligence, 2007.
Markdown
[Hansen. "Indefinite-Horizon POMDPs with Action-Based Termination." AAAI Conference on Artificial Intelligence, 2007.](https://mlanthology.org/aaai/2007/hansen2007aaai-indefinite/)
BibTeX
@inproceedings{hansen2007aaai-indefinite,
title = {{Indefinite-Horizon POMDPs with Action-Based Termination}},
author = {Hansen, Eric A.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2007},
pages = {1237-1242},
url = {https://mlanthology.org/aaai/2007/hansen2007aaai-indefinite/}
}