Learning to Multi-Task by Active Sampling

Abstract

One of the long-standing challenges in Artificial Intelligence for learning goal-directed behavior is to build a single agent which can solve multiple tasks. Recent progress in multi-task learning for goal-directed sequential problems has taken the form of distillation-based learning, wherein a student network learns from multiple task-specific expert networks by mimicking their task-specific policies. While such approaches offer a promising solution to the multi-task learning problem, they require supervision from large expert networks, which in turn require extensive data and computation time for training. In this work, we propose an efficient multi-task learning framework which solves multiple goal-directed tasks in an on-line setup without the need for expert supervision. Our work uses active learning principles to achieve multi-task learning by sampling the harder tasks more often than the easier ones. We propose three distinct models under our active sampling framework: an adaptive method with extremely competitive multi-tasking performance; a UCB-based meta-learner which casts the problem of picking the next task to train on as a multi-armed bandit problem; and a meta-learning method that casts the next-task picking problem as a full Reinforcement Learning problem and uses actor-critic methods to optimize the multi-tasking performance directly. We demonstrate results in the Atari 2600 domain on seven multi-tasking instances: three 6-task instances, one 8-task instance, two 12-task instances and one 21-task instance.
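The idea of sampling harder tasks more often, and of treating next-task selection as a multi-armed bandit, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how task difficulty is measured, so the sketch assumes each task has a positive target score and measures difficulty as the normalized shortfall from that target. The function names, the `temperature` parameter, and the use of the standard UCB1 rule are all illustrative assumptions.

```python
import math


def adaptive_task_probs(scores, targets, temperature=0.1):
    """Sampling probabilities that favor tasks far from their target score.

    Assumption: difficulty is the normalized shortfall (target - score) / target,
    with every target positive. The abstract does not fix this definition.
    """
    shortfalls = [max(0.0, (t - s) / t) for s, t in zip(scores, targets)]
    logits = [m / temperature for m in shortfalls]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def ucb1_pick(mean_rewards, counts, step):
    """Standard UCB1 arm selection, with one arm per task.

    Assumption: mean_rewards holds a per-task difficulty signal (for example the
    average shortfall above), counts holds how often each task was picked, and
    step is the total number of picks so far (at least 1 once all counts > 0).
    """
    # Try every task once before trusting the estimates.
    for task, n in enumerate(counts):
        if n == 0:
            return task
    bonuses = [math.sqrt(2.0 * math.log(step) / n) for n in counts]
    values = [m + b for m, b in zip(mean_rewards, bonuses)]
    return max(range(len(values)), key=values.__getitem__)
```

In a training loop, the probabilities from `adaptive_task_probs` could be passed as weights to `random.choices` to draw the next task, while `ucb1_pick` returns a task index directly and trades off difficulty against how rarely a task has been trained on.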

Cite

Text

Sharma et al. "Learning to Multi-Task by Active Sampling." International Conference on Learning Representations, 2018.

Markdown

[Sharma et al. "Learning to Multi-Task by Active Sampling." International Conference on Learning Representations, 2018.](https://mlanthology.org/iclr/2018/sharma2018iclr-learning/)

BibTeX

@inproceedings{sharma2018iclr-learning,
  title     = {{Learning to Multi-Task by Active Sampling}},
  author    = {Sharma, Sahil and Jha, Ashutosh Kumar and Hegde, Parikshit S and Ravindran, Balaraman},
  booktitle = {International Conference on Learning Representations},
  year      = {2018},
  url       = {https://mlanthology.org/iclr/2018/sharma2018iclr-learning/}
}