Methods for Computing State Similarity in Markov Decision Processes

Abstract

A popular approach to solving large probabilistic systems relies on aggregating states based on a measure of similarity. Many approaches in the literature are heuristic. A number of recent methods rely instead on metrics based on the notion of bisimulation, or behavioral equivalence between states (Givan et al., 2001, 2003; Ferns et al., 2004). An integral component of such metrics is the Kantorovich metric between probability distributions. However, while this metric has many satisfying theoretical properties, it is costly to compute in practice. In this paper, we use techniques from network optimization and statistical sampling to overcome this problem. We obtain in this manner a variety of distance functions for MDP state aggregation, which differ in their trade-off between time and space complexity, as well as in the quality of the resulting aggregation. We provide an empirical evaluation of these trade-offs.
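The Kantorovich metric referenced above, for discrete distributions over states with a ground metric `d`, reduces to a transportation linear program: find a coupling of the two distributions that minimizes expected ground distance. Below is a minimal illustrative sketch in Python that solves this LP directly with `scipy.optimize.linprog`; it is not the paper's optimized network-flow or sampling algorithms, and the function name `kantorovich` and the dense formulation are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog

def kantorovich(p, q, d):
    """Kantorovich (1st Wasserstein) distance between discrete
    distributions p and q over n points, with n x n ground-metric
    matrix d, solved as a transportation LP (illustrative sketch)."""
    n = len(p)
    # Decision variables: the coupling matrix lambda[i, j],
    # flattened row-major into a vector of n*n entries.
    c = d.reshape(-1)
    A_eq = np.zeros((2 * n, n * n))
    for i in range(n):
        A_eq[i, i * n:(i + 1) * n] = 1.0  # row i of lambda sums to p[i]
        A_eq[n + i, i::n] = 1.0           # column i of lambda sums to q[i]
    b_eq = np.concatenate([p, q])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun

# Example: two next-state distributions over 3 states, with |i - j|
# as the ground metric. Optimal plan moves 0.5 mass a distance of 2
# (or equivalently two 0.5 moves of distance 1), giving 1.0.
p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])
d = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)
print(kantorovich(p, q, d))  # 1.0
```

Solving such an LP for every pair of states is exactly the cost the abstract points to; the paper's variants instead exploit the transportation structure via network optimization or approximate the distributions by statistical sampling.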

Cite

Text

Ferns et al. "Methods for Computing State Similarity in Markov Decision Processes." Conference on Uncertainty in Artificial Intelligence, 2006.

Markdown

[Ferns et al. "Methods for Computing State Similarity in Markov Decision Processes." Conference on Uncertainty in Artificial Intelligence, 2006.](https://mlanthology.org/uai/2006/ferns2006uai-methods/)

BibTeX

@inproceedings{ferns2006uai-methods,
  title     = {{Methods for Computing State Similarity in Markov Decision Processes}},
  author    = {Ferns, Norm and Castro, Pablo Samuel and Precup, Doina and Panangaden, Prakash},
  booktitle = {Conference on Uncertainty in Artificial Intelligence},
  year      = {2006},
  url       = {https://mlanthology.org/uai/2006/ferns2006uai-methods/}
}