Improving Reinforcement Learning with Confidence-Based Demonstrations

Abstract

Reinforcement learning has had many successes, but in practice it often requires significant amounts of data to learn high-performing policies. One common way to improve learning is to allow a trained (source) agent to assist a new (target) agent. The goals in this setting are to 1) improve the target agent's performance, relative to learning unaided, and 2) allow the target agent to outperform the source agent. Our approach leverages source agent demonstrations, removing any requirements on the source agent's learning algorithm or representation. The target agent then estimates the source agent's policy and improves upon it. The key contribution of this work is to show that leveraging the target agent's uncertainty in the source agent's policy can significantly improve learning in two complex simulated domains, Keepaway and Mario.
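The abstract's idea of estimating the source policy from demonstrations and acting on it only when the estimate is confident can be sketched roughly as follows. This is an illustrative sketch, not the paper's algorithm: the tabular policy model, the `suggest`/`select_action` names, and the 0.8 confidence threshold are all assumptions for demonstration purposes.

```python
from collections import Counter, defaultdict

def build_policy_model(demonstrations):
    """Tabular estimate of the source policy from (state, action) pairs.
    (Hypothetical sketch; the paper uses a learned, confidence-measured
    model rather than a lookup table.)"""
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    return counts

def suggest(model, state):
    """Return (action, confidence) for a state, where confidence is the
    empirical fraction of demonstrations that chose that action."""
    if state not in model:
        return None, 0.0
    action, n = model[state].most_common(1)[0]
    return action, n / sum(model[state].values())

def select_action(model, state, own_policy, threshold=0.8):
    """Follow the source suggestion only when confidence clears the
    (illustrative) threshold; otherwise fall back to the target
    agent's own policy so it can improve beyond the source."""
    action, conf = suggest(model, state)
    if action is not None and conf >= threshold:
        return action
    return own_policy(state)
```

The fallback branch is what lets the target agent eventually outperform the source: in states where the demonstrations are ambiguous or absent, the agent relies on its own learned policy instead of imitating.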

Cite

Text

Wang and Taylor. "Improving Reinforcement Learning with Confidence-Based Demonstrations." International Joint Conference on Artificial Intelligence, 2017. doi:10.24963/ijcai.2017/422

Markdown

[Wang and Taylor. "Improving Reinforcement Learning with Confidence-Based Demonstrations." International Joint Conference on Artificial Intelligence, 2017.](https://mlanthology.org/ijcai/2017/wang2017ijcai-improving/) doi:10.24963/ijcai.2017/422

BibTeX

@inproceedings{wang2017ijcai-improving,
  title     = {{Improving Reinforcement Learning with Confidence-Based Demonstrations}},
  author    = {Wang, Zhaodong and Taylor, Matthew E.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2017},
  pages     = {3027--3033},
  doi       = {10.24963/ijcai.2017/422},
  url       = {https://mlanthology.org/ijcai/2017/wang2017ijcai-improving/}
}