Measuring Structural Similarities in Finite MDPs
Abstract
In this paper, we investigate the structural similarities within a finite Markov decision process (MDP). We view a finite MDP as a heterogeneous directed bipartite graph and propose novel measures for state similarity and action similarity in a mutual reinforcement manner. We prove that the state similarity is a metric and the action similarity is a pseudometric. We also establish the connection between the proposed similarity measures and the optimal values of the MDP. Extensive experiments show that the proposed measures are effective.
Cite
Text
Wang et al. "Measuring Structural Similarities in Finite MDPs." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/511
Markdown
[Wang et al. "Measuring Structural Similarities in Finite MDPs." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/wang2019ijcai-measuring/) doi:10.24963/IJCAI.2019/511
BibTeX
@inproceedings{wang2019ijcai-measuring,
title = {{Measuring Structural Similarities in Finite MDPs}},
author = {Wang, Hao and Dong, Shaokang and Shao, Ling},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2019},
pages = {3684-3690},
doi = {10.24963/IJCAI.2019/511},
url = {https://mlanthology.org/ijcai/2019/wang2019ijcai-measuring/}
}