Probabilistic Constrained Reinforcement Learning with Formal Interpretability

ICML 2024 pp. 51303-51327

Abstract

Reinforcement learning can provide effective reasoning for sequential decision-making problems with variable dynamics. In practical implementations, however, interpreting the reward function and the corresponding optimal policy remains a persistent challenge. Representing sequential decision-making problems as probabilistic inference can therefore have considerable value: in principle, inference offers diverse and powerful mathematical tools for inferring the stochastic dynamics while suggesting a probabilistic interpretation of policy optimization. In this study, we propose a novel Adaptive Wasserstein Variational Optimization, AWaVO, to tackle these interpretability challenges. Our approach uses formal methods to achieve interpretability in three respects: convergence guarantees, training transparency, and intrinsic decision interpretation. To demonstrate its practicality, we showcase guaranteed interpretability, including a global convergence rate of $\Theta(1/\sqrt{T})$, both in simulation and in real-world quadrotor tasks. In comparison with state-of-the-art benchmarks, including TRPO-IPO, PCPO, and CRPO, we empirically verify that AWaVO offers a reasonable trade-off between high performance and sufficient interpretability.
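
To give a feel for what a $\Theta(1/\sqrt{T})$ rate implies in practice, the following minimal sketch assumes the optimality gap after $T$ iterations behaves exactly like $c/\sqrt{T}$. The constant `c`, the function name `iterations_for_gap`, and the exact-equality model are illustrative assumptions and are not taken from the paper; a $\Theta(\cdot)$ statement only bounds the gap up to constant factors for large $T$.

```python
import math


def iterations_for_gap(c: float, eps: float) -> int:
    """Smallest T such that c / sqrt(T) <= eps.

    Assumes the idealized model gap(T) = c / sqrt(T); `c` is a
    hypothetical problem-dependent constant, not a value from the paper.
    """
    if c <= 0 or eps <= 0:
        raise ValueError("c and eps must be positive")
    # c / sqrt(T) <= eps  <=>  T >= (c / eps) ** 2
    return math.ceil((c / eps) ** 2)


if __name__ == "__main__":
    for eps in (0.5, 0.25):
        print(eps, iterations_for_gap(1.0, eps))
```

Under this assumed model, halving the target gap quadruples the number of iterations required, which is the characteristic cost of a $1/\sqrt{T}$ rate.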

Cite

Text

Wang et al. "Probabilistic Constrained Reinforcement Learning with Formal Interpretability." International Conference on Machine Learning, 2024.

Markdown

[Wang et al. "Probabilistic Constrained Reinforcement Learning with Formal Interpretability." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/wang2024icml-probabilistic/)

BibTeX

@inproceedings{wang2024icml-probabilistic,
  title     = {{Probabilistic Constrained Reinforcement Learning with Formal Interpretability}},
  author    = {Wang, Yanran and Qian, Qiuchen and Boyle, David},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {51303-51327},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/wang2024icml-probabilistic/}
}