A Kernel Perspective on Behavioural Metrics for Markov Decision Processes
Abstract
We present a novel perspective on behavioural metrics for Markov decision processes via the use of positive definite kernels. We define a new metric under this lens that is provably equivalent to the recently introduced MICo distance (Castro et al., 2021). The kernel perspective enables us to provide new theoretical results, including value-function bounds and low-distortion finite-dimensional Euclidean embeddings, which are crucial when using behavioural metrics for reinforcement learning representations. We complement our theory with strong empirical results that demonstrate the effectiveness of these methods in practice.
Cite
Text
Castro et al. "A Kernel Perspective on Behavioural Metrics for Markov Decision Processes." Transactions on Machine Learning Research, 2023.
Markdown
[Castro et al. "A Kernel Perspective on Behavioural Metrics for Markov Decision Processes." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/castro2023tmlr-kernel/)
BibTeX
@article{castro2023tmlr-kernel,
title = {{A Kernel Perspective on Behavioural Metrics for Markov Decision Processes}},
author = {Castro, Pablo Samuel and Kastner, Tyler and Panangaden, Prakash and Rowland, Mark},
journal = {Transactions on Machine Learning Research},
year = {2023},
url = {https://mlanthology.org/tmlr/2023/castro2023tmlr-kernel/}
}