Online Multi-Task Gradient Temporal-Difference Learning

Abstract

We develop an online multi-task formulation of model-based gradient temporal-difference (GTD) reinforcement learning. Our approach enables an autonomous RL agent to accumulate knowledge over its lifetime and efficiently share this knowledge between tasks to accelerate learning. Rather than learning a policy for a reinforcement learning task tabula rasa, as in standard GTD, our approach rapidly learns a high-performance policy by building upon the agent's previously learned knowledge. Our preliminary results on controlling different mountain car tasks demonstrate that our approach, GTD-ELLA, significantly improves learning over standard GTD(0).
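For context, the GTD(0) baseline referenced above is the gradient TD algorithm of Sutton et al. (2008), which maintains a secondary weight vector tracking the expected TD update. The sketch below is an illustrative reconstruction of that baseline under linear function approximation, not the authors' GTD-ELLA code; the toy two-state chain, step sizes, and function names are assumptions for demonstration.

```python
# Hedged sketch of the GTD(0) update with linear value-function
# approximation; the toy MDP and hyperparameters are illustrative.

def gtd0_update(theta, u, phi, phi_next, reward, gamma, alpha, beta):
    """One GTD(0) step: theta are value weights, u tracks E[delta * phi]."""
    v = sum(t * f for t, f in zip(theta, phi))            # V(s)  = theta . phi
    v_next = sum(t * f for t, f in zip(theta, phi_next))  # V(s') = theta . phi'
    delta = reward + gamma * v_next - v                   # TD error
    # Secondary weights drift toward the expected TD update delta * phi.
    u = [ui + beta * (delta * fi - ui) for ui, fi in zip(u, phi)]
    # Primary weights move along the gradient-corrected direction.
    corr = sum(fi * ui for fi, ui in zip(phi, u))         # phi . u
    theta = [ti + alpha * (fi - gamma * fn) * corr
             for ti, fi, fn in zip(theta, phi, phi_next)]
    return theta, u

# Toy two-state chain: s0 -> s1 (reward 0), s1 -> terminal (reward 1).
features = {0: [1.0, 0.0], 1: [0.0, 1.0], "end": [0.0, 0.0]}
theta, u = [0.0, 0.0], [0.0, 0.0]
for _ in range(3000):
    theta, u = gtd0_update(theta, u, features[0], features[1], 0.0, 0.9, 0.05, 0.05)
    theta, u = gtd0_update(theta, u, features[1], features["end"], 1.0, 0.9, 0.05, 0.05)
```

With tabular features the learned values should approach V(s1) = 1 and V(s0) = 0.9; GTD-ELLA's contribution is to share structure in `theta` across multiple such tasks rather than learning each from scratch.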

Cite

Text

Sreenivasan et al. "Online Multi-Task Gradient Temporal-Difference Learning." AAAI Conference on Artificial Intelligence, 2014. doi:10.1609/AAAI.V28I1.9106

Markdown

[Sreenivasan et al. "Online Multi-Task Gradient Temporal-Difference Learning." AAAI Conference on Artificial Intelligence, 2014.](https://mlanthology.org/aaai/2014/sreenivasan2014aaai-online/) doi:10.1609/AAAI.V28I1.9106

BibTeX

@inproceedings{sreenivasan2014aaai-online,
  title     = {{Online Multi-Task Gradient Temporal-Difference Learning}},
  author    = {Sreenivasan, Vishnu Purushothaman and Bou-Ammar, Haitham and Eaton, Eric},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2014},
  pages     = {3136--3137},
  doi       = {10.1609/AAAI.V28I1.9106},
  url       = {https://mlanthology.org/aaai/2014/sreenivasan2014aaai-online/}
}