Learning State Importance for Preference-Based Reinforcement Learning

Abstract

Preference-based reinforcement learning (PbRL) develops agents using human preferences. Given its empirical success, it holds promise for human-centered applications. However, previous work on PbRL overlooks interpretability, an indispensable element of ethical artificial intelligence (AI). While prior work on explainable AI offers some machinery, it lacks an approach for selecting the samples used to construct explanations. This is a problem for PbRL, as transitions relevant to task solving are often outnumbered by irrelevant ones, so ad hoc sample selection undermines the credibility of explanations. The present study proposes a framework that learns reward functions and state importance from preferences simultaneously, offering a systematic approach to sample selection when constructing explanations. It also proposes a perturbation analysis to evaluate the learned state importance quantitatively. Through experiments on discrete and continuous control tasks, the study demonstrates the proposed framework's efficacy in providing interpretability without sacrificing task performance.
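To make the idea concrete, the following is a minimal, hypothetical sketch of how a reward model with per-state importance weights could be trained from a pairwise preference, using a Bradley-Terry-style loss. The function names, weighting scheme, and aggregation are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def preference_loss(rewards_a, rewards_b, weights_a, weights_b, prefer_a):
    """Bradley-Terry-style loss for one preference over two trajectory segments.

    Illustrative assumption: per-state rewards are aggregated with learned
    importance weights, and the segment the human prefers should receive
    a higher weighted score.
    """
    score_a = np.sum(weights_a * rewards_a)
    score_b = np.sum(weights_b * rewards_b)
    # Probability that segment A is preferred (logistic over score difference).
    p_a = 1.0 / (1.0 + np.exp(score_b - score_a))
    # Cross-entropy against the human's binary preference label.
    return -np.log(p_a) if prefer_a else -np.log(1.0 - p_a)
```

Because the importance weights enter the loss directly, states that drive the preference receive larger weights during training, and those weights can later be used to pick which transitions to show in an explanation.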

Cite

Text

Zhang and Kashima. "Learning State Importance for Preference-Based Reinforcement Learning." Machine Learning, 2024. doi:10.1007/s10994-022-06295-5

Markdown

[Zhang and Kashima. "Learning State Importance for Preference-Based Reinforcement Learning." Machine Learning, 2024.](https://mlanthology.org/mlj/2024/zhang2024mlj-learning/) doi:10.1007/s10994-022-06295-5

BibTeX

@article{zhang2024mlj-learning,
  title     = {{Learning State Importance for Preference-Based Reinforcement Learning}},
  author    = {Zhang, Guoxi and Kashima, Hisashi},
  journal   = {Machine Learning},
  year      = {2024},
  pages     = {1885--1901},
  doi       = {10.1007/s10994-022-06295-5},
  volume    = {113},
  url       = {https://mlanthology.org/mlj/2024/zhang2024mlj-learning/}
}