Rating-Based Reinforcement Learning
Abstract
This paper develops a novel rating-based reinforcement learning approach that uses human ratings to obtain human guidance in reinforcement learning. Unlike existing preference-based and ranking-based reinforcement learning paradigms, which rely on relative human preferences over sample pairs, the proposed approach relies on human evaluations of individual trajectories, with no relative comparison between sample pairs. The approach builds on a new prediction model for human ratings and a novel multi-class loss function. We conduct several experimental studies with both synthetic and real human ratings to evaluate the effectiveness and benefits of the new approach.
Cite
Text
White et al. "Rating-Based Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I9.28886
Markdown
[White et al. "Rating-Based Reinforcement Learning." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/white2024aaai-rating/) doi:10.1609/AAAI.V38I9.28886
BibTeX
@inproceedings{white2024aaai-rating,
title = {{Rating-Based Reinforcement Learning}},
author = {White, Devin and Wu, Mingkang and Novoseller, Ellen R. and Lawhern, Vernon J. and Waytowich, Nicholas R. and Cao, Yongcan},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
pages = {10207-10215},
doi = {10.1609/AAAI.V38I9.28886},
url = {https://mlanthology.org/aaai/2024/white2024aaai-rating/}
}