Online Multi-Task Learning for Policy Gradient Methods

Abstract

Policy gradient algorithms have shown considerable recent success in solving high-dimensional sequential decision-making tasks, particularly in robotics. However, these methods often require extensive experience in a domain to achieve high performance. To make agents more sample-efficient, we develop a multi-task policy gradient method to learn decision-making tasks consecutively, transferring knowledge between tasks to accelerate learning. Our approach provides robust theoretical guarantees, and we show empirically that it dramatically accelerates learning on a variety of dynamical systems, including an application to quadrotor control.

Cite

Text

Ammar et al. "Online Multi-Task Learning for Policy Gradient Methods." International Conference on Machine Learning, 2014.

Markdown

[Ammar et al. "Online Multi-Task Learning for Policy Gradient Methods." International Conference on Machine Learning, 2014.](https://mlanthology.org/icml/2014/ammar2014icml-online/)

BibTeX

@inproceedings{ammar2014icml-online,
  title     = {{Online Multi-Task Learning for Policy Gradient Methods}},
  author    = {Ammar, Haitham Bou and Eaton, Eric and Ruvolo, Paul and Taylor, Matthew},
  booktitle = {International Conference on Machine Learning},
  year      = {2014},
  pages     = {1206--1214},
  volume    = {32},
  url       = {https://mlanthology.org/icml/2014/ammar2014icml-online/}
}