Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance
Abstract
As robots become a mass consumer product, they will need to learn new skills by interacting with typical human users. Past approaches have adapted reinforcement learning (RL) to accept a human reward signal; however, we question the implicit assumption that people will only want to give the learner feedback on its past actions. We present findings from a human user study showing that people use the reward signal not only to provide feedback about past actions, but also to provide future-directed rewards to guide subsequent actions. Given this, we made specific modifications to the simulated RL robot to incorporate guidance. We then analyze and evaluate its learning performance in a second user study, and we report significant improvements on several measures. This work demonstrates the importance of understanding the human-teacher/robot-learner system as a whole in order to design algorithms that support how people want to teach while simultaneously improving the robot's learning performance.
Cite
Text

Thomaz and Breazeal. "Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance." AAAI Conference on Artificial Intelligence, 2006.

Markdown

[Thomaz and Breazeal. "Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance." AAAI Conference on Artificial Intelligence, 2006.](https://mlanthology.org/aaai/2006/thomaz2006aaai-reinforcement/)

BibTeX
@inproceedings{thomaz2006aaai-reinforcement,
title = {{Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance}},
author = {Thomaz, Andrea Lockerd and Breazeal, Cynthia},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2006},
pages = {1000-1006},
url = {https://mlanthology.org/aaai/2006/thomaz2006aaai-reinforcement/}
}