Equivalence Relations in Fully and Partially Observable Markov Decision Processes

Abstract

We explore equivalence relations between states in Markov Decision Processes and Partially Observable Markov Decision Processes. We focus on two different equivalence notions: bisimulation (Givan et al., 2003) and a notion of trace equivalence, under which states are considered equivalent if they generate the same conditional probability distributions over observation sequences (where the conditioning is on action sequences). We show that the relationship between these two equivalence notions changes depending on the amount and nature of the partial observability. We also present an alternate characterization of bisimulation based on trajectory equivalence.

Pablo Samuel Castro, Prakash Panangaden, Doina Precup
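For a finite MDP, the bisimulation notion mentioned above (states equivalent when they agree on immediate rewards and on transition probability into each equivalence class, for every action) can be computed by partition refinement. The sketch below is illustrative only and not from the paper; the toy MDP, the data layout (`R[s][a]` for rewards, `P[s][a]` as a dict of successor probabilities), and the function name are assumptions.

```python
# Minimal sketch of bisimulation partition refinement for a finite MDP.
# Assumed data layout (not from the paper): R[s][a] is the immediate
# reward, P[s][a] maps successor states to transition probabilities.

def bisimulation_classes(states, actions, R, P):
    """Refine a partition of states until every block is closed under the
    bisimulation conditions: equal rewards and equal transition mass into
    each block, for each action."""

    def signature(s, blocks):
        # For each action: the reward at s, plus the probability mass
        # s sends into each current block (rounded to tame float noise).
        return tuple(
            (R[s][a],) + tuple(
                round(sum(P[s][a].get(t, 0.0) for t in block), 10)
                for block in blocks
            )
            for a in actions
        )

    blocks = [tuple(states)]  # start with one block containing all states
    while True:
        groups = {}
        for s in states:
            groups.setdefault(signature(s, blocks), []).append(s)
        new_blocks = [tuple(b) for b in groups.values()]
        if len(new_blocks) == len(blocks):  # fixed point reached
            return new_blocks
        blocks = new_blocks

# Toy example: states 0 and 1 have identical rewards and dynamics,
# so they end up bisimilar; state 2 differs in reward.
states = [0, 1, 2]
actions = ["a"]
R = {0: {"a": 1.0}, 1: {"a": 1.0}, 2: {"a": 0.0}}
P = {0: {"a": {2: 1.0}}, 1: {"a": {2: 1.0}}, 2: {"a": {2: 1.0}}}
classes = bisimulation_classes(states, actions, R, P)
```

On this toy MDP the procedure returns two classes, grouping states 0 and 1 together and leaving state 2 on its own.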

Cite

Text

Castro et al. "Equivalence Relations in Fully and Partially Observable Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2009.

Markdown

[Castro et al. "Equivalence Relations in Fully and Partially Observable Markov Decision Processes." International Joint Conference on Artificial Intelligence, 2009.](https://mlanthology.org/ijcai/2009/castro2009ijcai-equivalence/)

BibTeX

@inproceedings{castro2009ijcai-equivalence,
  title     = {{Equivalence Relations in Fully and Partially Observable Markov Decision Processes}},
  author    = {Castro, Pablo Samuel and Panangaden, Prakash and Precup, Doina},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2009},
  pages     = {1653--1658},
  url       = {https://mlanthology.org/ijcai/2009/castro2009ijcai-equivalence/}
}