The Size of MDP Factored Policies
Abstract
Policies of Markov Decision Processes (MDPs) specify the next action to execute, given the current state and (possibly) the history of actions executed so far. Factorization is used when the number of states is exponentially large: both the MDP and the policy can then be represented in a compact form, for example using circuits. We prove that there are MDPs whose optimal policies require exponential space even in factored form.
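As a rough illustration of what "factored" means here (not a construction from the paper): a state over n boolean variables fits in n bits, so 2^n states never need to be enumerated, and a policy can be given as a boolean circuit over those bits instead of a lookup table with 2^n rows. The Python sketch below is hypothetical; the variable names and the toy two-gate policy are illustrative assumptions.

```python
# Illustrative sketch (not from the paper): a factored MDP state over n
# boolean variables, with the policy given as a small boolean circuit
# rather than a table with 2**n entries.

N_VARS = 4  # hypothetical number of state variables; the state space has 2**4 = 16 states

def policy_circuit(state):
    """Toy factored policy: a boolean circuit mapping an n-bit state
    to one of two actions. Its size is measured in gates, not states."""
    x0, x1, x2, x3 = state
    # Two gates (an AND and an OR): constant size, independent of 2**n.
    return "act_a" if (x0 and x1) or x3 else "act_b"

# Evaluate the policy on one of the 2**n states without ever
# materializing the exponential-size state-action table.
state = (True, False, False, True)
print(policy_circuit(state))  # -> act_a
```

The paper's result says this kind of compression cannot always succeed: for some MDPs, any circuit computing an optimal policy must itself be exponentially large.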
Cite
Text
Liberatore. "The Size of MDP Factored Policies." AAAI Conference on Artificial Intelligence, 2002. doi:10.5555/777092.777136
Markdown
[Liberatore. "The Size of MDP Factored Policies." AAAI Conference on Artificial Intelligence, 2002.](https://mlanthology.org/aaai/2002/liberatore2002aaai-size/) doi:10.5555/777092.777136
BibTeX
@inproceedings{liberatore2002aaai-size,
title = {{The Size of MDP Factored Policies}},
author = {Liberatore, Paolo},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2002},
pages = {267--272},
doi = {10.5555/777092.777136},
url = {https://mlanthology.org/aaai/2002/liberatore2002aaai-size/}
}