Variance-Seeking Meta-Exploration to Handle Out-of-Distribution Tasks

Grewal, Yashvir Singh; de Nijs, Frits; Goodwin, Sarah

Yashvir Singh Grewal, Frits de Nijs, Sarah Goodwin

NeurIPSW 2021

Abstract

Meta-Reinforcement Learning (meta-RL) has the potential to improve the sample efficiency of reinforcement learning algorithms. By training an agent on multiple meta-RL tasks, the agent learns a policy based on past experience and leverages it to solve new, unseen tasks. Accordingly, meta-RL promises to solve real-world problems, such as real-time heating, ventilation and air-conditioning (HVAC) control, without accurate simulators of the target building. In this paper, we propose a meta-RL method which trains an agent on first-order models to efficiently learn and adapt to the internal dynamics of a real-world building. We recognise that meta-agents trained on first-order simulator models do not perform well on second-order models, owing to the meta-RL assumption that the test tasks should be from within the same distribution as the training tasks. In response, we propose a novel exploration method called variance-seeking meta-exploration, which enables a meta-RL agent to perform well on complex tasks outside of its training distribution. Our method programs the agent to prefer exploring task-dependent state-action pairs and, in turn, allows it to adapt efficiently to challenging second-order models, which bear greater resemblance to real-world problems.
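The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what "preferring task-dependent state-action pairs" could look like, not the authors' implementation. It assumes that a state-action pair is task-dependent when its predicted outcome varies strongly across sampled first-order models. The single-RC thermal model, the function names (`first_order_step`, `task_variance`, `variance_seeking_action`) and all parameter values are hypothetical.

```python
import numpy as np

def first_order_step(temp, action, r, c, t_out=5.0, power=2000.0, dt=60.0):
    """One Euler step of a hypothetical first-order (single RC) thermal model.

    temp: indoor temperature, action: heater level in [0, 1],
    r: thermal resistance, c: thermal capacitance (the task parameters).
    """
    # Heat exchanged with the outside plus heat injected by the heater.
    d_temp = ((t_out - temp) / r + action * power) / c
    return temp + dt * d_temp

def task_variance(temp, action, tasks):
    """Variance of the predicted next temperature across sampled tasks."""
    preds = np.array([first_order_step(temp, action, r, c) for r, c in tasks])
    return float(preds.var())

def variance_seeking_action(temp, actions, tasks):
    """Pick the action whose outcome differs most between tasks (assumed criterion)."""
    scores = [task_variance(temp, a, tasks) for a in actions]
    return actions[int(np.argmax(scores))]

# Sample a few hypothetical training tasks as (r, c) pairs.
rng = np.random.default_rng(0)
tasks = [(rng.uniform(0.005, 0.02), rng.uniform(1e6, 5e6)) for _ in range(8)]
actions = [0.0, 0.5, 1.0]

# When indoor and outdoor temperatures are equal, doing nothing yields the same
# next state in every task, while heating reveals the task's capacitance.
probe = variance_seeking_action(5.0, actions, tasks)
```

In this toy setting, at `temp == t_out` the idle action carries no information about the task, so the sketch selects full heating as the most informative probe. A real meta-RL agent would more plausibly use such a variance term as an exploration bonus alongside the task reward rather than as the sole action-selection rule.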

Cite

Text

Grewal et al. "Variance-Seeking Meta-Exploration to Handle Out-of-Distribution Tasks." NeurIPS 2021 Workshops: DeepRL, 2021.

Markdown

[Grewal et al. "Variance-Seeking Meta-Exploration to Handle Out-of-Distribution Tasks." NeurIPS 2021 Workshops: DeepRL, 2021.](https://mlanthology.org/neuripsw/2021/grewal2021neuripsw-varianceseeking/)

BibTeX

@inproceedings{grewal2021neuripsw-varianceseeking,
  title     = {{Variance-Seeking Meta-Exploration to Handle Out-of-Distribution Tasks}},
  author    = {Grewal, Yashvir Singh and de Nijs, Frits and Goodwin, Sarah},
  booktitle = {NeurIPS 2021 Workshops: DeepRL},
  year      = {2021},
  url       = {https://mlanthology.org/neuripsw/2021/grewal2021neuripsw-varianceseeking/}
}