SceneStreamer: Continuous Scenario Generation as Next Token Group Prediction

Abstract

Realistic and interactive traffic simulation is essential for training and evaluating autonomous driving systems. However, most existing data-driven simulation methods rely on static initialization or log-replay data, limiting their ability to model dynamic, long-horizon scenarios with evolving agent populations. We propose SceneStreamer, a unified autoregressive framework for continuous scenario generation that represents the entire scene as a sequence of tokens, including traffic light signals, agent states, and motion vectors, and generates them step by step with a transformer model. This design enables SceneStreamer to continuously introduce and retire agents over an unbounded horizon, supporting realistic long-duration simulation. Experiments demonstrate that SceneStreamer produces realistic, diverse, and adaptive traffic behaviors. Furthermore, reinforcement learning policies trained in SceneStreamer-generated scenarios achieve superior robustness and generalization, validating its utility as a high-fidelity simulation environment for autonomous driving. More information is available at https://vail-ucla.github.io/scenestreamer/ .
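As a rough illustration of the step-by-step generation described above, the following toy sketch rolls out a scene one token group at a time: a traffic light token, then agent state tokens that may introduce a new agent, then one motion token per active agent. It is not the authors' implementation. A random stub stands in for the transformer, and every name here (`Scene`, `stub_model`, `step`, `rollout`, the vocabulary sizes, the one-dimensional road) is a hypothetical assumption made for illustration only.

```python
import random
from dataclasses import dataclass, field

# Hypothetical toy sizes; the paper's actual tokenization is not specified here.
NUM_LIGHT_STATES = 3     # e.g. red / yellow / green
NUM_MOTION_TOKENS = 5    # discretized forward displacement per step (0..4)
ROAD_LENGTH = 20         # agents reaching this position are retired


@dataclass
class Scene:
    tokens: list = field(default_factory=list)  # flat history of emitted tokens
    agents: dict = field(default_factory=dict)  # agent id -> position on the road
    next_id: int = 0                            # id to assign to the next new agent


def stub_model(history, kind, rng):
    """Stand-in for a transformer. A real model would condition on `history`;
    this stub ignores it and samples uniformly."""
    if kind == "light":
        return rng.randrange(NUM_LIGHT_STATES)
    if kind == "spawn":
        return rng.random() < 0.5
    return rng.randrange(NUM_MOTION_TOKENS)


def step(scene, rng):
    """Generate one time step as three consecutive token groups."""
    # Group 1: traffic light token.
    light = stub_model(scene.tokens, "light", rng)
    scene.tokens.append(("light", light))

    # Group 2: agent state tokens; optionally introduce a new agent at position 0.
    if stub_model(scene.tokens, "spawn", rng):
        scene.agents[scene.next_id] = 0
        scene.tokens.append(("agent", scene.next_id, 0))
        scene.next_id += 1

    # Group 3: one motion token per active agent; retire agents that leave the road.
    for agent_id in list(scene.agents):
        move = stub_model(scene.tokens, "motion", rng)
        scene.tokens.append(("motion", agent_id, move))
        scene.agents[agent_id] += move
        if scene.agents[agent_id] >= ROAD_LENGTH:
            del scene.agents[agent_id]


def rollout(num_steps, seed=0):
    """Run the generation loop for `num_steps` steps and return the final scene."""
    rng = random.Random(seed)
    scene = Scene()
    for _ in range(num_steps):
        step(scene, rng)
    return scene
```

Calling `rollout(50)` returns a `Scene` whose `tokens` list interleaves the three group types and whose `agents` dict holds only the agents still on the road. Because agents are added and removed inside the loop, the horizon can be extended simply by running more steps.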

Cite

Text

Peng et al. "SceneStreamer: Continuous Scenario Generation as Next Token Group Prediction." International Conference on Learning Representations, 2026.

Markdown

[Peng et al. "SceneStreamer: Continuous Scenario Generation as Next Token Group Prediction." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/peng2026iclr-scenestreamer/)

BibTeX

@inproceedings{peng2026iclr-scenestreamer,
  title     = {{SceneStreamer: Continuous Scenario Generation as Next Token Group Prediction}},
  author    = {Peng, Zhenghao and Liu, Yuxin and Zhou, Bolei},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/peng2026iclr-scenestreamer/}
}