Prompts and Pre-Trained Language Models for Offline Reinforcement Learning

Abstract

In this preliminary study, we introduce a simple way to leverage pre-trained language models in deep offline RL settings that are not naturally suited for textual representation. We propose transforming states into human-readable text and minimally fine-tuning the pre-trained language model when training with deep offline RL algorithms. This approach shows consistent performance gains on the NeoRL MuJoCo datasets. Our experiments suggest that LM fine-tuning is crucial for good performance on robotics tasks; however, we also show that it is not necessary in finance environments to retain a significant improvement in final performance.
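The sketch below illustrates one possible reading of the pipeline described in the abstract: a numeric observation is rendered as human-readable text, encoded with a pre-trained language model, and the pooled embedding is passed to a downstream offline RL policy or critic. The prompt template, the GPT-2 backbone, and the mean pooling are illustrative assumptions, not the authors' exact implementation.

```python
# Hypothetical sketch of the state-to-text idea, assuming a GPT-2 backbone
# from the `transformers` library; details differ from the paper's setup.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel


def state_to_prompt(state: np.ndarray) -> str:
    """Render a MuJoCo-style observation vector as human-readable text."""
    values = ", ".join(f"{x:.3f}" for x in state)
    return f"The agent observes the following state: {values}."


tokenizer = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModel.from_pretrained("gpt2")  # kept trainable so it can be fine-tuned with the RL loss

state = np.random.randn(17)  # e.g., a HalfCheetah observation
inputs = tokenizer(state_to_prompt(state), return_tensors="pt")
hidden = lm(**inputs).last_hidden_state          # (1, seq_len, hidden_dim)
state_embedding = hidden.mean(dim=1)             # pooled representation fed to the policy/critic
```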

Cite

Text

Tarasov et al. "Prompts and Pre-Trained Language Models for Offline Reinforcement Learning." ICLR 2022 Workshops: GPL, 2022.

Markdown

[Tarasov et al. "Prompts and Pre-Trained Language Models for Offline Reinforcement Learning." ICLR 2022 Workshops: GPL, 2022.](https://mlanthology.org/iclrw/2022/tarasov2022iclrw-prompts/)

BibTeX

@inproceedings{tarasov2022iclrw-prompts,
  title     = {{Prompts and Pre-Trained Language Models for Offline Reinforcement Learning}},
  author    = {Tarasov, Denis and Kurenkov, Vladislav and Kolesnikov, Sergey},
  booktitle = {ICLR 2022 Workshops: GPL},
  year      = {2022},
  url       = {https://mlanthology.org/iclrw/2022/tarasov2022iclrw-prompts/}
}