Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization

Abstract

Recent progress in offline reinforcement learning (RL) has shown that it is often possible to train strong agents without potentially unsafe or impractical online interaction. However, in real-world settings, agents may encounter unseen environments with different dynamics, so the ability to generalize is required. This work presents the Dynamics-Augmented Decision Transformer (DADT), a simple yet efficient method for training generalizable agents from offline datasets. On top of a return-conditioned policy built on the transformer architecture, DADT improves generalization by using representation learning based on next-state prediction. Our experimental results demonstrate that DADT outperforms prior state-of-the-art methods for offline dynamics generalization; intriguingly, DADT without fine-tuning even outperforms fine-tuned baselines.
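To make the core idea concrete, the sketch below illustrates a dynamics-augmented objective of the kind the abstract describes: a shared encoder feeds both an action head (the return-conditioned policy loss) and an auxiliary next-state-prediction head. This is a minimal toy in NumPy, not DADT's actual implementation; the encoder is a single random linear map standing in for the Decision Transformer, and the weighting `lam` is a hypothetical hyperparameter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only; not taken from the paper).
state_dim, act_dim, hidden = 4, 2, 8

# Stand-in for the shared sequence encoder (a Decision Transformer in DADT);
# a fixed random linear map keeps the sketch self-contained.
W_enc = rng.normal(size=(state_dim, hidden))
W_act = rng.normal(size=(hidden, act_dim))    # action-prediction head
W_dyn = rng.normal(size=(hidden, state_dim))  # auxiliary next-state head

def dynamics_augmented_loss(state, action, next_state, lam=0.1):
    """Action-prediction loss plus a next-state prediction auxiliary
    loss on the shared representation (hypothetical weighting `lam`)."""
    h = np.tanh(state @ W_enc)                    # shared representation
    action_loss = np.mean((h @ W_act - action) ** 2)
    dynamics_loss = np.mean((h @ W_dyn - next_state) ** 2)
    return action_loss + lam * dynamics_loss

s = rng.normal(size=state_dim)
a = rng.normal(size=act_dim)
s_next = rng.normal(size=state_dim)
print(dynamics_augmented_loss(s, a, s_next))
```

Because both heads share the encoder, minimizing the auxiliary term pushes the representation to capture environment dynamics, which is the mechanism the paper credits for better generalization to unseen dynamics.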

Cite

Text

Kim et al. "Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization." NeurIPS 2022 Workshops: Offline_RL, 2022.

Markdown

[Kim et al. "Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization." NeurIPS 2022 Workshops: Offline_RL, 2022.](https://mlanthology.org/neuripsw/2022/kim2022neuripsw-dynamicsaugmented/)

BibTeX

@inproceedings{kim2022neuripsw-dynamicsaugmented,
  title     = {{Dynamics-Augmented Decision Transformer for Offline Dynamics Generalization}},
  author    = {Kim, Changyeon and Kim, Junsu and Seo, Younggyo and Lee, Kimin and Lee, Honglak and Shin, Jinwoo},
  booktitle = {NeurIPS 2022 Workshops: Offline_RL},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/kim2022neuripsw-dynamicsaugmented/}
}