Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance
Abstract
As robots become a mass consumer product, they will need to learn new skills by interacting with typical human users. Past approaches have adapted reinforcement learning (RL) to accept a human reward signal; however, we question the implicit assumption that people will only want to give the learner feedback on its past actions. We present findings from a human user study showing that people use the reward signal not only to provide feedback about past actions, but also to provide future-directed rewards to guide subsequent actions. Given this, we made specific modifications to the simulated RL robot to incorporate guidance. We then analyze and evaluate its learning performance in a second user study, and we report significant improvements on several measures. This work demonstrates the importance of understanding the human-teacher/robot-learner system as a whole in order to design algorithms that support how people want to teach while simultaneously improving the robot's learning performance.
Cite
Text

Thomaz and Breazeal. "Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance." AAAI Conference on Artificial Intelligence, 2006.

Markdown

[Thomaz and Breazeal. "Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance." AAAI Conference on Artificial Intelligence, 2006.](https://mlanthology.org/aaai/2006/thomaz2006aaai-reinforcement/)

BibTeX
@inproceedings{thomaz2006aaai-reinforcement,
title = {{Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance}},
author = {Thomaz, Andrea Lockerd and Breazeal, Cynthia},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2006},
pages = {1000-1006},
url = {https://mlanthology.org/aaai/2006/thomaz2006aaai-reinforcement/}
}