Distributionally Adaptive Meta Reinforcement Learning

Abstract

Meta-reinforcement learning algorithms provide a data-driven way to acquire learning procedures that quickly adapt to many tasks with varying rewards or dynamics functions. However, learned meta-policies are often effective only on the exact task distribution on which they were trained, and struggle under distribution shift in test-time rewards or transition dynamics. In this work, we develop a framework for meta-RL algorithms that behave appropriately under test-time distribution shifts in the space of tasks. Our framework centers on an adaptive approach to distributional robustness, in which we train a population of meta-agents to be robust to varying levels of distribution shift, so that when evaluated on a (potentially shifted) test-time distribution of tasks, we can adaptively choose the most appropriate meta-agent to follow. We formally show how this framework allows for improved regret under distribution shift, and empirically show its efficacy on simulated robotics problems under a wide range of distribution shifts.
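
To make the adaptive-selection idea concrete, below is a minimal Python sketch of the high-level scheme described in the abstract: meta-agents are trained at a range of assumed shift levels, and at test time the agent whose robustness level best matches the (unknown) shift is chosen by its post-adaptation return on a small budget of test tasks. All names here (`MetaAgent`, `select_meta_agent`, `EPSILONS`, the toy return function) are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np

# Assumed levels of distribution shift (uncertainty-set radii) each meta-agent is trained against.
EPSILONS = [0.0, 0.1, 0.5, 1.0]

class MetaAgent:
    """Placeholder for a meta-policy trained to be robust to tasks within radius eps of the training distribution."""
    def __init__(self, eps):
        self.eps = eps

    def adapt_and_return(self, task):
        # Stand-in for few-shot adaptation on a task followed by return evaluation.
        # A real agent would run its learned adaptation procedure here.
        return -abs(self.eps - task)

def select_meta_agent(agents, test_tasks, adaptation_budget=5):
    """At test time, pick the meta-agent with the highest average post-adaptation return on a few test tasks."""
    scores = [np.mean([agent.adapt_and_return(t) for t in test_tasks[:adaptation_budget]])
              for agent in agents]
    return agents[int(np.argmax(scores))]

# Usage: train a population at varying robustness levels, then choose adaptively on the shifted test distribution.
agents = [MetaAgent(eps) for eps in EPSILONS]
test_tasks = list(np.random.uniform(0.0, 1.5, size=10))  # stand-in for a shifted test-time task distribution
best = select_meta_agent(agents, test_tasks)
print(f"Selected meta-agent trained for shift level eps={best.eps}")
```

This sketch only illustrates the selection step; the paper's contribution also includes how each meta-agent is trained to be robust at its assigned shift level and the regret analysis of the overall procedure.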

Cite

Text

Ajay et al. "Distributionally Adaptive Meta Reinforcement Learning." ICML 2022 Workshops: DARL, 2022.

Markdown

[Ajay et al. "Distributionally Adaptive Meta Reinforcement Learning." ICML 2022 Workshops: DARL, 2022.](https://mlanthology.org/icmlw/2022/ajay2022icmlw-distributionally/)

BibTeX

@inproceedings{ajay2022icmlw-distributionally,
  title     = {{Distributionally Adaptive Meta Reinforcement Learning}},
  author    = {Ajay, Anurag and Ghosh, Dibya and Levine, Sergey and Agrawal, Pulkit and Gupta, Abhishek},
  booktitle = {ICML 2022 Workshops: DARL},
  year      = {2022},
  url       = {https://mlanthology.org/icmlw/2022/ajay2022icmlw-distributionally/}
}