Prompt Tuning Decision Transformers with Structured and Scalable Bandits

Abstract

Prompt tuning has emerged as a key technique for adapting large pre-trained Decision Transformers (DTs) in offline Reinforcement Learning (RL), particularly in multi-task and few-shot settings. The Prompting Decision Transformer (PDT) enables task generalization via trajectory prompts sampled uniformly from expert demonstrations, without accounting for prompt informativeness. In this work, we propose a bandit-based prompt-tuning method that learns to construct optimal trajectory prompts from demonstration data at inference time. We devise a structured bandit architecture operating in the trajectory-prompt space, achieving linear rather than combinatorial scaling with prompt size. Additionally, we show that the pre-trained PDT itself can serve as a powerful feature extractor for the bandit, enabling efficient reward modeling across various environments. We theoretically establish regret bounds and demonstrate empirically that our method consistently enhances performance across a wide range of tasks, high-dimensional environments, and out-of-distribution scenarios, outperforming existing prompt-tuning baselines.
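
The core mechanism sketched in the abstract, a structured bandit that scores trajectory-prompt segments with features from the frozen pre-trained PDT so that selection cost grows linearly with prompt size, can be illustrated with a minimal LinUCB-style sketch. Everything below is a hypothetical illustration under stated assumptions, not the paper's implementation: `encode_segment` stands in for the frozen PDT feature extractor, the pool sizes and dimensions are made up, and the episode return is simulated rather than obtained by rolling out the DT.

```python
# Minimal LinUCB-style structured bandit over trajectory-prompt segments.
# Assumptions (hypothetical, not from the paper): prompts are built from K
# independent segment slots, the reward is modeled as linear in the sum of
# segment features, and a frozen pre-trained PDT provides those features.
import numpy as np

rng = np.random.default_rng(0)
D = 16       # feature dimension of the (assumed) PDT segment embedding
K = 3        # number of segments per trajectory prompt
POOL = 50    # candidate demonstration segments per slot
ALPHA = 1.0  # UCB exploration coefficient

def encode_segment(seg_id: int) -> np.ndarray:
    """Placeholder for the frozen PDT encoder mapping a segment to features."""
    return np.random.default_rng(seg_id).standard_normal(D)

# Shared linear reward model, estimated by ridge regression: theta = A^{-1} b.
A = np.eye(D)
b = np.zeros(D)

def select_prompt() -> list[int]:
    """Pick one segment per slot by UCB score; cost is O(K * POOL), not POOL**K."""
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b
    prompt = []
    for slot in range(K):
        best, best_ucb = 0, -np.inf
        for seg in range(POOL):
            x = encode_segment(slot * POOL + seg)
            # Per-slot UCB is a structured approximation to the joint arm's UCB.
            ucb = theta @ x + ALPHA * np.sqrt(x @ A_inv @ x)
            if ucb > best_ucb:
                best, best_ucb = seg, ucb
        prompt.append(best)
    return prompt

def update(prompt: list[int], episode_return: float) -> None:
    """Standard LinUCB update on the prompt's combined (summed) feature."""
    global A, b
    z = sum(encode_segment(slot * POOL + seg) for slot, seg in enumerate(prompt))
    A += np.outer(z, z)
    b += episode_return * z

# Toy interaction loop; the return is a stand-in for rolling out the DT.
for t in range(10):
    p = select_prompt()
    ret = float(rng.standard_normal())
    update(p, ret)
```

Because the prompt's predicted reward decomposes additively over its K segment features, the greedy per-slot maximization touches only K * POOL candidates instead of POOL**K joint arms, which is the linear-versus-combinatorial scaling the abstract refers to. For LinUCB-style algorithms on such linear reward models, standard analyses give regret on the order of d * sqrt(T) up to log factors; the bounds established in the paper itself may differ.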

Cite

Text

Rietz et al. "Prompt Tuning Decision Transformers with Structured and Scalable Bandits." Advances in Neural Information Processing Systems, 2025.

Markdown

[Rietz et al. "Prompt Tuning Decision Transformers with Structured and Scalable Bandits." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/rietz2025neurips-prompt/)

BibTeX

@inproceedings{rietz2025neurips-prompt,
  title     = {{Prompt Tuning Decision Transformers with Structured and Scalable Bandits}},
  author    = {Rietz, Finn and Smirnov, Oleg and Karimi, Sara and Cao, Lele},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/rietz2025neurips-prompt/}
}