Indefinite-Horizon POMDPs with Action-Based Termination

Abstract

For decision-theoretic planning problems with an indefinite horizon, plan execution terminates after a finite number of steps with probability one, but the number of steps until termination (i.e., the horizon) is uncertain and unbounded. In the traditional model for such problems, the stochastic shortest-path problem, plan execution terminates when a particular state is reached, typically a goal state. We consider a model in which plan execution terminates when a stopping action is taken. We show that an action-based model of termination has several advantages for partially observable planning problems. It does not require a goal state to be fully observable; it does not require achievement of a goal state to be guaranteed; and it allows a proper policy to be found more easily. This framework allows many partially observable planning problems to be modeled in a more realistic way that does not require an artificial discount factor.
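To make the idea concrete, here is a minimal illustrative sketch (not taken from the paper) of value iteration over a discretized belief simplex for a two-state POMDP with an explicit "stop" action. All numbers (stop rewards, step cost, transition and observation probabilities) are invented for illustration. Because the stop action is available in every belief state, "always stop" is trivially a proper policy, and with a positive per-step cost no discount factor is needed for the values to remain bounded.

```python
# Illustrative sketch: indefinite-horizon POMDP with an action-based
# termination ("stop") action, solved by value iteration on a belief grid.
# All parameters below are made-up assumptions, not from the paper.

STOP_REWARD = [0.0, 10.0]  # reward for stopping in state 0 / state 1
STEP_COST = 1.0            # cost of each "continue" step (no discounting)
P_STAY = 0.8               # P(s' = s | continue)
P_OBS_CORRECT = 0.9        # P(o = s' | s'): noisy observation of next state

def belief_update(b, o):
    """Bayes update of P(state = 1) after 'continue' and observing o."""
    # predicted next-state probability of state 1
    b1 = b * P_STAY + (1 - b) * (1 - P_STAY)
    like1 = P_OBS_CORRECT if o == 1 else 1 - P_OBS_CORRECT
    like0 = P_OBS_CORRECT if o == 0 else 1 - P_OBS_CORRECT
    p_o = b1 * like1 + (1 - b1) * like0   # probability of observation o
    return b1 * like1 / p_o, p_o          # posterior P(state = 1), P(o)

def value_iteration(grid_size=101, iters=200):
    """Approximate V(b) on an evenly spaced belief grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    V = [0.0] * grid_size
    for _ in range(iters):
        new_V = []
        for b in grid:
            # Stopping terminates execution: expected terminal reward.
            q_stop = (1 - b) * STOP_REWARD[0] + b * STOP_REWARD[1]
            # Continuing pays a step cost, then backs up over observations.
            q_cont = -STEP_COST
            for o in (0, 1):
                b_post, p_o = belief_update(b, o)
                idx = round(b_post * (grid_size - 1))  # nearest grid point
                q_cont += p_o * V[idx]
            new_V.append(max(q_stop, q_cont))
        V = new_V
    return grid, V
```

Since stopping is always an option, V(b) is bounded below by the expected stop reward at every belief b, which is what makes a proper policy easy to exhibit in this model.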

Cite

Text

Hansen. "Indefinite-Horizon POMDPs with Action-Based Termination." AAAI Conference on Artificial Intelligence, 2007.

Markdown

[Hansen. "Indefinite-Horizon POMDPs with Action-Based Termination." AAAI Conference on Artificial Intelligence, 2007.](https://mlanthology.org/aaai/2007/hansen2007aaai-indefinite/)

BibTeX

@inproceedings{hansen2007aaai-indefinite,
  title     = {{Indefinite-Horizon POMDPs with Action-Based Termination}},
  author    = {Hansen, Eric A.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2007},
  pages     = {1237--1242},
  url       = {https://mlanthology.org/aaai/2007/hansen2007aaai-indefinite/}
}