Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach

Abstract

We consider the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes (MDPs) chosen randomly from a fixed but unknown distribution. We model the distribution over MDPs using a hierarchical Bayesian infinite mixture model. For each novel MDP, we use the previously learned distribution as an informed prior for model-based Bayesian reinforcement learning. The hierarchical Bayesian framework provides a strong prior that allows us to rapidly infer the characteristics of new environments based on previous environments, while the use of a nonparametric model allows us to quickly adapt to environments we have not encountered before. In addition, the use of infinite mixtures allows the model to automatically learn the number of underlying MDP components. We evaluate our approach and show that it leads to significant speedups in convergence to an optimal policy after observing only a small number of tasks.
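The infinite mixture over MDP classes described in the abstract is typically realized with a Dirichlet process, whose predictive distribution is the Chinese Restaurant Process: a new task joins an existing class in proportion to how many tasks that class already contains, or opens a fresh class in proportion to a concentration parameter. The sketch below illustrates only that assignment step; the function name, the class counts, and the value of `alpha` are illustrative assumptions, not details from the paper.

```python
import random

def crp_assign(counts, alpha):
    """Sample a class for a new task under a Chinese Restaurant Process
    prior: existing class k is drawn with probability proportional to
    counts[k]; a brand-new class with probability proportional to alpha.
    (Illustrative sketch; not the paper's full inference procedure.)"""
    total = sum(counts) + alpha
    r = random.uniform(0, total)
    acc = 0.0
    for k, n in enumerate(counts):
        acc += n
        if r < acc:
            return k          # join existing MDP class k
    return len(counts)        # open a new MDP class

# Example: three previously seen MDP classes holding 5, 3, and 2 tasks.
random.seed(0)
draws = [crp_assign([5, 3, 2], alpha=1.0) for _ in range(1000)]
```

Because the probability of a new class never vanishes, the number of components grows as genuinely novel environments appear, which is how the model "learns the number of underlying MDP components" without fixing it in advance.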

Cite

Text

Wilson et al. "Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach." International Conference on Machine Learning, 2007. doi:10.1145/1273496.1273624

Markdown

[Wilson et al. "Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach." International Conference on Machine Learning, 2007.](https://mlanthology.org/icml/2007/wilson2007icml-multi/) doi:10.1145/1273496.1273624

BibTeX

@inproceedings{wilson2007icml-multi,
  title     = {{Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach}},
  author    = {Wilson, Aaron and Fern, Alan and Ray, Soumya and Tadepalli, Prasad},
  booktitle = {International Conference on Machine Learning},
  year      = {2007},
  pages     = {1015--1022},
  doi       = {10.1145/1273496.1273624},
  url       = {https://mlanthology.org/icml/2007/wilson2007icml-multi/}
}