Approximation Algorithms for Solving Cost Observable Markov Decision Processes
Abstract
Partial observability is a result of noisy or imperfect sensors that are not able to reveal the real state of the world. Problems that suffer from partial observability have been modeled as Partially Observable Markov Decision Processes (POMDPs), which have been studied by researchers in Operations Research and Artificial Intelligence for the past 30 years. Nevertheless, computing the optimal solution, or even a close approximation to it, is known to be NP-hard. Current algorithms are very expensive and do not scale well. Many applications can be modeled as POMDPs: quality control, autonomous robots, weapon allocation, and medical diagnosis. Designing approximation algorithms to solve problems that have partial observability is the focus of this research. The model we propose (Cost Observable Markov Decision Processes, or COMDPs) associates costs with obtaining information about the current state (Bayer 1998). A COMDP's actions are of two kinds: world actions and observation actions. World actions change the state of the world but return no observation information. Observation actions return observation information, but the state of the world does not change while performing them. COMDPs are intended to model situations that arise in diagnosis and active vision, where there are many observation actions that do not change the world and relatively few world-changing actions. We are particularly interested in problems where there are many alternative sensing actions (including, especially, no sensing at all) and where, if all observation actions are performed, the entire state of the world is observable (but presumably at very great cost). Hence, COMDPs can also be viewed as a form of fully observable Markov Decision Processes (MDPs) where the agent must pay to receive state information (they are cost-observable). Any POMDP can be modeled as a COMDP and vice versa.
We want to approximately solve COMDPs and determine what POMDP classes these approximations are good for. There are two fundamental problems that any POMDP approximation algorithm must address: how to act when lost and how to avoid getting lost.
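The split between world actions and observation actions described above can be illustrated with a minimal sketch. This is our own illustrative code, not from the paper; all names (`COMDP`, `step_world`, `step_observe`, the "inspect"/"repair" example) are hypothetical. It shows the defining asymmetry: a world action changes the state but returns nothing about it, while an observation action leaves the state fixed and returns information at a cost.

```python
import random

class COMDP:
    """Hypothetical minimal COMDP sketch (names are our own).

    World actions change the state but return no observation;
    observation actions leave the state unchanged and return
    information about it, at a cost.
    """

    def __init__(self, states, world_actions, obs_actions, rewards):
        self.states = states
        self.world_actions = world_actions  # name -> {state: [(next_state, prob), ...]}
        self.obs_actions = obs_actions      # name -> (cost, observe_fn)
        self.rewards = rewards              # (state, action_name) -> reward

    def step_world(self, state, action, rng=random):
        """Apply a world action: the state changes stochastically,
        and no observation information is returned."""
        outcomes = self.world_actions[action][state]
        r, cum = rng.random(), 0.0
        for nxt, p in outcomes:
            cum += p
            if r < cum:
                return nxt, self.rewards.get((state, action), 0.0)
        return outcomes[-1][0], self.rewards.get((state, action), 0.0)

    def step_observe(self, state, action):
        """Apply an observation action: pay its cost, receive an
        observation; the underlying state does not change."""
        cost, observe = self.obs_actions[action]
        return observe(state), -cost


# A toy two-state diagnosis problem: the agent may pay to "inspect"
# (full observation of the state, at cost 2) or act blindly with "repair".
comdp = COMDP(
    states=["ok", "faulty"],
    world_actions={"repair": {"ok": [("ok", 1.0)],
                              "faulty": [("ok", 0.9), ("faulty", 0.1)]}},
    obs_actions={"inspect": (2.0, lambda s: s)},
    rewards={("faulty", "repair"): -5.0},
)

obs, reward = comdp.step_observe("faulty", "inspect")
# obs is the true state "faulty", reward is -2.0 (the sensing cost),
# and the world state itself is unchanged by the observation.
```

Here "inspect" reveals the entire state, matching the special case the abstract highlights: if all observation actions are performed, the world becomes fully observable, but at a cost the agent must weigh against acting blindly.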
Cite
[Bayer. "Approximation Algorithms for Solving Cost Observable Markov Decision Processes." AAAI Conference on Artificial Intelligence, 1999.](https://mlanthology.org/aaai/1999/bayer1999aaai-approximation/)
@inproceedings{bayer1999aaai-approximation,
title = {{Approximation Algorithms for Solving Cost Observable Markov Decision Processes}},
author = {Bayer, Valentina},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {1999},
pages = {941},
url = {https://mlanthology.org/aaai/1999/bayer1999aaai-approximation/}
}