Predictable MDP Abstraction for Unsupervised Model-Based RL
Abstract
A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions. Errors in this predictive model can degrade the performance of model-based controllers, and complex Markov decision processes (MDPs) can present exceptionally difficult prediction problems. To mitigate this issue, we propose predictable MDP abstraction (PMA): instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space that only permits predictable, easy-to-model actions, while covering the original state-action space as much as possible. As a result, model learning becomes easier and more accurate, which allows robust, stable model-based planning or model-based RL. This transformation is learned in an unsupervised manner, before any task is specified by the user. Downstream tasks can then be solved with model-based control in a zero-shot fashion, without additional environment interactions. We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches in a range of benchmark environments. Our code and videos are available at https://seohong.me/projects/pma/
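To make the idea concrete, here is a minimal illustrative sketch (not the authors' released code) of how the pieces described in the abstract could fit together: a latent action z replaces the raw action, a policy pi(a|s,z) decodes z into an executable action, and a dynamics model p_hat(s'|s,z) is learned on the transformed MDP. The network sizes, the squared-error predictability term, and the `coverage_bonus` placeholder are assumptions made for illustration only.

```python
# Illustrative sketch of a PMA-style setup, under the assumptions stated above.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, LATENT_DIM = 17, 6, 6  # example dimensions (assumed)


class LatentPolicy(nn.Module):
    """pi(a | s, z): decodes a learned latent action z into a raw action."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, ACTION_DIM), nn.Tanh(),
        )

    def forward(self, s, z):
        return self.net(torch.cat([s, z], dim=-1))


class LatentModel(nn.Module):
    """p_hat(s' | s, z): predicts the next state directly from the latent action."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, STATE_DIM),
        )

    def forward(self, s, z):
        return self.net(torch.cat([s, z], dim=-1))


def intrinsic_reward(model, s, z, s_next, coverage_bonus):
    """Predictability: reward transitions that the latent model explains well.
    `coverage_bonus` is a stand-in for whatever term encourages covering the
    original state-action space; its exact form is an assumption here."""
    prediction_error = ((model(s, z) - s_next) ** 2).sum(dim=-1)
    return -prediction_error + coverage_bonus


# Example usage with placeholder data: the policy pi(a|s,z) would be trained
# with RL on the intrinsic reward, the model by regression on collected
# transitions, and downstream tasks solved zero-shot by planning over z.
policy, model = LatentPolicy(), LatentModel()
s = torch.randn(32, STATE_DIM)
z = torch.randn(32, LATENT_DIM)
a = policy(s, z)                         # raw action executed in the real MDP
s_next = s + 0.1 * torch.randn_like(s)   # placeholder transition data
r_int = intrinsic_reward(model, s, z, s_next, coverage_bonus=torch.zeros(32))
```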
Cite
Text
Park and Levine. "Predictable MDP Abstraction for Unsupervised Model-Based RL." International Conference on Machine Learning, 2023.Markdown
[Park and Levine. "Predictable MDP Abstraction for Unsupervised Model-Based RL." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/park2023icml-predictable/)BibTeX
@inproceedings{park2023icml-predictable,
title = {{Predictable MDP Abstraction for Unsupervised Model-Based RL}},
author = {Park, Seohong and Levine, Sergey},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {27246--27268},
volume = {202},
url = {https://mlanthology.org/icml/2023/park2023icml-predictable/}
}