Nonapproximability Results for Partially Observable Markov Decision Processes
Abstract
We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies either are unlikely to have, or simply do not have, guarantees of finding policies within a constant factor or a constant summand of optimal. Here "unlikely" means "unless some complexity classes collapse," where the collapses considered are P = NP, P = PSPACE, or P = EXP. Until or unless these collapses are shown to hold, any control-policy designer must choose between such performance guarantees and efficient computation.
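To make the two kinds of guarantee concrete, here is a standard formalization (my gloss, not quoted from the paper; the authors' exact definitions may differ). Write V* for the value of an optimal policy and V(π) for the value of the policy π that an algorithm returns. The abstract's two guarantees are then

\[ V(\pi) \;\ge\; V^{*}/c \quad \text{for some fixed } c \ge 1 \quad \text{(within a constant factor)} \]
\[ V(\pi) \;\ge\; V^{*} - d \quad \text{for some fixed } d \ge 0 \quad \text{(within a constant summand)} \]

The nonapproximability results say that, for the POMDP variants considered, no polynomial-time algorithm achieves either bound unless the corresponding complexity classes collapse.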
Cite
Lusena et al. "Nonapproximability Results for Partially Observable Markov Decision Processes." Journal of Artificial Intelligence Research, 2001. doi:10.1613/JAIR.714
BibTeX
@article{lusena2001jair-nonapproximability,
title = {{Nonapproximability Results for Partially Observable Markov Decision Processes}},
author = {Lusena, Christopher and Goldsmith, Judy and Mundhenk, Martin},
journal = {Journal of Artificial Intelligence Research},
year = {2001},
pages = {83--103},
doi = {10.1613/JAIR.714},
volume = {14},
url = {https://mlanthology.org/jair/2001/lusena2001jair-nonapproximability/}
}