Online Multi-Task Learning for Policy Gradient Methods
Abstract
Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision-making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we develop a multi-task policy gradient method that learns decision-making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.
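The abstract refers to policy gradient methods in general terms. As a minimal illustration of the single-task baseline such methods build on (this is not the paper's multi-task algorithm), the sketch below applies a vanilla REINFORCE update, theta += alpha * grad log pi(a) * r, to a softmax policy on a two-armed bandit; the reward values and learning rate are illustrative assumptions.

```python
import numpy as np

# Illustrative REINFORCE sketch on a one-step two-armed bandit.
# All constants below are assumptions chosen for the example.

rng = np.random.default_rng(0)
theta = np.zeros(2)          # one logit per action
alpha = 0.1                  # learning rate (assumed)
mean_rewards = [0.2, 1.0]    # action 1 is better in expectation (assumed)

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for _ in range(500):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)
    r = mean_rewards[a] + rng.normal(0.0, 0.1)  # noisy scalar reward
    # For a softmax policy, grad of log pi(a | theta) = one_hot(a) - probs.
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    # Vanilla policy gradient ascent step (no baseline).
    theta += alpha * grad_log_pi * r

print(softmax(theta))  # probability mass should concentrate on action 1
```

In practice, learning such a policy from scratch for every task is exactly the sample inefficiency the paper targets; its contribution is to share knowledge across consecutively learned tasks rather than repeat this process independently.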
Cite
Text

Ammar et al. "Online Multi-Task Learning for Policy Gradient Methods." International Conference on Machine Learning, 2014.

Markdown

[Ammar et al. "Online Multi-Task Learning for Policy Gradient Methods." International Conference on Machine Learning, 2014.](https://mlanthology.org/icml/2014/ammar2014icml-online/)

BibTeX
@inproceedings{ammar2014icml-online,
title = {{Online Multi-Task Learning for Policy Gradient Methods}},
author = {Ammar, Haitham Bou and Eaton, Eric and Ruvolo, Paul and Taylor, Matthew},
booktitle = {International Conference on Machine Learning},
year = {2014},
pages = {1206--1214},
volume = {32},
url = {https://mlanthology.org/icml/2014/ammar2014icml-online/}
}