Emergent Tangled Program Graphs in Multi-Task Learning

Abstract

We propose a Genetic Programming (GP) framework to address high-dimensional Multi-Task Reinforcement Learning (MTRL) through emergent modularity. A bottom-up process is assumed in which multiple programs self-organize into collective decision-making entities, or teams, which then further develop into multi-team policy graphs, or Tangled Program Graphs (TPG). The framework learns to play three Atari video games simultaneously, producing a single control policy that matches or exceeds leading results from (game-specific) deep reinforcement learning in each game. More importantly, unlike the representation assumed for deep learning, TPG policies start simple and adaptively complexify through interaction with the task environment. The resulting agents are exceedingly simple and operate in real time without specialized hardware support such as GPUs.
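To make the bottom-up structure concrete, the following is a minimal sketch (not the authors' implementation) of how a TPG-style policy selects an action: each program bids on the current state and proposes either an atomic action or a pointer to another team, and following the highest bid through referenced teams traverses the policy graph. The `Program` and `Team` classes and the linear bid function are illustrative assumptions; real TPG programs are evolved register machines.

```python
class Program:
    """Bids on a state and proposes an action.

    The action is either an atomic task action (an int) or a reference
    to another Team -- such references are what tangle teams into a graph.
    """
    def __init__(self, weights, action):
        self.weights = weights  # stand-in for an evolved register program
        self.action = action    # int (atomic) or Team (graph edge)

    def bid(self, state):
        # A dot product stands in for executing an evolved program.
        return sum(w * s for w, s in zip(self.weights, state))


class Team:
    """A collective of programs; the highest bidder's action is followed."""
    def __init__(self, programs):
        self.programs = programs

    def act(self, state, visited=None):
        visited = set() if visited is None else visited
        visited.add(id(self))
        # Skip edges to already-visited teams to avoid cycles. In TPG,
        # each team retains at least one program with an atomic action,
        # so a winner always exists.
        candidates = [p for p in self.programs
                      if not isinstance(p.action, Team)
                      or id(p.action) not in visited]
        winner = max(candidates, key=lambda p: p.bid(state))
        if isinstance(winner.action, Team):
            return winner.action.act(state, visited)
        return winner.action
```

For example, a root team whose winning program points to a leaf team delegates the decision to that leaf, so the policy graph is traversed anew for every state; complexity grows only as evolution adds such inter-team references.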

Cite

Text

Kelly and Heywood. "Emergent Tangled Program Graphs in Multi-Task Learning." International Joint Conference on Artificial Intelligence, 2018. doi:10.24963/ijcai.2018/740

Markdown

[Kelly and Heywood. "Emergent Tangled Program Graphs in Multi-Task Learning." International Joint Conference on Artificial Intelligence, 2018.](https://mlanthology.org/ijcai/2018/kelly2018ijcai-emergent/) doi:10.24963/ijcai.2018/740

BibTeX

@inproceedings{kelly2018ijcai-emergent,
  title     = {{Emergent Tangled Program Graphs in Multi-Task Learning}},
  author    = {Kelly, Stephen and Heywood, Malcolm I.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2018},
  pages     = {5294--5298},
  doi       = {10.24963/ijcai.2018/740},
  url       = {https://mlanthology.org/ijcai/2018/kelly2018ijcai-emergent/}
}