Measuring Structural Similarities in Finite MDPs

Abstract

In this paper, we investigate the structural similarities within a finite Markov decision process (MDP). We view a finite MDP as a heterogeneous directed bipartite graph and propose novel measures for state similarity and action similarity in a mutual reinforcement manner. We prove that the state similarity is a metric and the action similarity is a pseudometric. We also establish the connection between the proposed similarity measures and the optimal values of the MDP. Extensive experiments show that the proposed measures are effective.
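To make the graph view in the abstract concrete, here is a minimal sketch of one way a finite MDP could be laid out as a directed bipartite graph with two node types. The edge layout is an assumption for illustration: state nodes point to action nodes (one per available state-action pair), and action nodes point back to state nodes with a transition probability and reward on each edge. The function name `mdp_to_bipartite_graph`, the input format, and the toy MDP are hypothetical and are not taken from the paper; the similarity measures themselves are not implemented here.

```python
from collections import defaultdict


def mdp_to_bipartite_graph(transitions):
    """Build a two-layer directed graph from a finite MDP.

    `transitions` (hypothetical format) maps (state, action) to a list of
    (next_state, probability, reward) triples.

    Returns:
      decision_edges:   state -> list of action nodes available in that state
      transition_edges: action node -> list of (next_state, probability, reward)
    """
    decision_edges = defaultdict(list)
    transition_edges = {}
    for (state, action), outcomes in transitions.items():
        total = sum(prob for _, prob, _ in outcomes)
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"probabilities for {(state, action)} sum to {total}")
        # Each state-action pair gets its own action node, so edges only ever
        # run state -> action or action -> state (the bipartite property).
        action_node = (state, action)
        decision_edges[state].append(action_node)
        transition_edges[action_node] = list(outcomes)
        # Register successor states that have no actions (absorbing states).
        for next_state, _, _ in outcomes:
            decision_edges.setdefault(next_state, [])
    return dict(decision_edges), transition_edges


# Toy two-state MDP, invented purely for illustration.
toy_mdp = {
    ("s0", "left"): [("s0", 1.0, 0.0)],
    ("s0", "right"): [("s1", 0.8, 1.0), ("s0", 0.2, 0.0)],
}
decision_edges, transition_edges = mdp_to_bipartite_graph(toy_mdp)
```

In this layout a state with no outgoing decision edges, such as `s1` in the toy example, is absorbing. Any similarity computation over states and actions would then operate on these two edge maps.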

Cite

Text

Wang et al. "Measuring Structural Similarities in Finite MDPs." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/511

Markdown

[Wang et al. "Measuring Structural Similarities in Finite MDPs." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/wang2019ijcai-measuring/) doi:10.24963/IJCAI.2019/511

BibTeX

@inproceedings{wang2019ijcai-measuring,
  title     = {{Measuring Structural Similarities in Finite MDPs}},
  author    = {Wang, Hao and Dong, Shaokang and Shao, Ling},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2019},
  pages     = {3684--3690},
  doi       = {10.24963/IJCAI.2019/511},
  url       = {https://mlanthology.org/ijcai/2019/wang2019ijcai-measuring/}
}