Prompting Decision Transformer for Few-Shot Policy Generalization

Abstract

Human can leverage prior experience and learn novel tasks from a handful of demonstrations. In contrast to offline meta-reinforcement learning, which aims to achieve quick adaptation through better algorithm design, we investigate the effect of architecture inductive bias on the few-shot learning capability. We propose a Prompt-based Decision Transformer (Prompt-DT), which leverages the sequential modeling ability of the Transformer architecture and the prompt framework to achieve few-shot adaptation in offline RL. We design the trajectory prompt, which contains segments of the few-shot demonstrations, and encodes task-specific information to guide policy generation. Our experiments in five MuJoCo control benchmarks show that Prompt-DT is a strong few-shot learner without any extra finetuning on unseen target tasks. Prompt-DT outperforms its variants and strong meta offline RL baselines by a large margin with a trajectory prompt containing only a few timesteps. Prompt-DT is also robust to prompt length changes and can generalize to out-of-distribution (OOD) environments. Project page: \href{https://mxu34.github.io/PromptDT/}https://mxu34.github.io/PromptDT/.

Cite

Text

Xu et al. "Prompting Decision Transformer for Few-Shot Policy Generalization." International Conference on Machine Learning, 2022.

Markdown

[Xu et al. "Prompting Decision Transformer for Few-Shot Policy Generalization." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/xu2022icml-prompting/)

BibTeX

@inproceedings{xu2022icml-prompting,
  title     = {{Prompting Decision Transformer for Few-Shot Policy Generalization}},
  author    = {Xu, Mengdi and Shen, Yikang and Zhang, Shun and Lu, Yuchen and Zhao, Ding and Tenenbaum, Joshua and Gan, Chuang},
  booktitle = {International Conference on Machine Learning},
  year      = {2022},
  pages     = {24631-24645},
  volume    = {162},
  url       = {https://mlanthology.org/icml/2022/xu2022icml-prompting/}
}