Recursively-Constrained Partially Observable Markov Decision Processes
Abstract
Many sequential decision problems involve optimizing one objective function while imposing constraints on other objectives. Constrained Partially Observable Markov Decision Processes (C-POMDPs) model this setting under transition uncertainty and partial observability. In this work, we first show that C-POMDPs violate the optimal substructure property over successive decision steps and thus may exhibit behaviors that are undesirable for some (e.g., safety-critical) applications. Additionally, online re-planning in C-POMDPs is often ineffective due to the inconsistency resulting from this violation. To address these drawbacks, we introduce the Recursively-Constrained POMDP (RC-POMDP), which imposes additional history-dependent cost constraints on the C-POMDP. We show that, unlike C-POMDPs, RC-POMDPs always have deterministic optimal policies and that optimal policies obey Bellman’s principle of optimality. We also present a point-based dynamic programming algorithm for RC-POMDPs. Evaluations on benchmark problems demonstrate the efficacy of our algorithm and show that policies for RC-POMDPs produce more desirable behaviors than policies for C-POMDPs.
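For orientation, the C-POMDP problem that the abstract refers to is commonly written as a constrained optimization over policies, as sketched here. The notation is an assumption chosen for illustration and is not taken from the paper: $b_0$ is the initial belief, $\gamma$ the discount factor, $R$ the reward function, $C_i$ the cost functions, and $\hat{c}_i$ the cost budgets.

```latex
\begin{aligned}
\max_{\pi} \quad & V_R^{\pi}(b_0) = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \,\middle|\, b_0, \pi\right] \\
\text{s.t.} \quad & V_{C_i}^{\pi}(b_0) = \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t} C_i(s_t, a_t) \,\middle|\, b_0, \pi\right] \le \hat{c}_i, \qquad i = 1, \dots, n.
\end{aligned}
```

In this form the constraint is imposed only in expectation from the initial belief $b_0$. The RC-POMDP described in the abstract adds further history-dependent cost constraints; their exact form is given in the paper and is not reproduced here.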
Cite
Text
Ho et al. "Recursively-Constrained Partially Observable Markov Decision Processes." Uncertainty in Artificial Intelligence, 2024.
Markdown
[Ho et al. "Recursively-Constrained Partially Observable Markov Decision Processes." Uncertainty in Artificial Intelligence, 2024.](https://mlanthology.org/uai/2024/ho2024uai-recursivelyconstrained/)
BibTeX
@inproceedings{ho2024uai-recursivelyconstrained,
  title = {{Recursively-Constrained Partially Observable Markov Decision Processes}},
  author = {Ho, Qi Heng and Becker, Tyler and Kraske, Benjamin and Laouar, Zakariya and Feather, Martin and Rossi, Federico and Lahijanian, Morteza and Sunberg, Zachary},
  booktitle = {Uncertainty in Artificial Intelligence},
  year = {2024},
  pages = {1658--1680},
  volume = {244},
  url = {https://mlanthology.org/uai/2024/ho2024uai-recursivelyconstrained/}
}