Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models

Abstract

Traffic simulation aims to learn a policy for traffic agents that, when unrolled in closed-loop, faithfully recovers the joint distribution of trajectories observed in the real world. Inspired by large language models, tokenized multi-agent policies have recently become the state-of-the-art in traffic simulation. However, they are typically trained through open-loop behavior cloning, and thus suffer from covariate shift when executed in closed-loop during simulation. In this work, we present Closest Among Top-K (CAT-K) rollouts, a simple yet effective closed-loop fine-tuning strategy to mitigate covariate shift. CAT-K fine-tuning only requires existing trajectory data, without reinforcement learning or generative adversarial imitation. Concretely, CAT-K fine-tuning enables a small 7M-parameter tokenized traffic simulation policy to outperform a 102M-parameter model from the same model family, achieving the top spot on the Waymo Sim Agent Challenge leaderboard at the time of submission. The code is available at https://github.com/NVlabs/catk.
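The abstract names the rollout strategy but does not spell out its mechanics. Reading "Closest Among Top-K" literally, one plausible selection step is: keep the K most likely motion tokens under the policy, then pick the one that lands nearest the ground-truth next position. The sketch below illustrates only that reading. The function name, the representation of tokens as 2D offsets, and the toy values are all hypothetical and are not taken from the paper or its repository.

```python
import math


def closest_among_top_k(logits, token_offsets, current_xy, gt_next_xy, k):
    """Among the k highest-logit tokens, return the index of the token whose
    resulting position is nearest to the ground-truth next position.

    Assumption: each token is a 2D displacement (dx, dy) applied to current_xy.
    """
    # Indices of the k most likely tokens under the policy.
    top_k = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    def dist_to_gt(i):
        dx, dy = token_offsets[i]
        return math.hypot(
            current_xy[0] + dx - gt_next_xy[0],
            current_xy[1] + dy - gt_next_xy[1],
        )

    # Closest to the ground truth among those k candidates.
    return min(top_k, key=dist_to_gt)


# Toy vocabulary of four tokens (hypothetical values).
logits = [2.0, 1.5, 0.1, -1.0]
token_offsets = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5), (-1.0, 0.0)]
current_xy = (0.0, 0.0)
gt_next_xy = (0.5, 0.6)

greedy = closest_among_top_k(logits, token_offsets, current_xy, gt_next_xy, k=1)
top2 = closest_among_top_k(logits, token_offsets, current_xy, gt_next_xy, k=2)
top3 = closest_among_top_k(logits, token_offsets, current_xy, gt_next_xy, k=3)
```

Under this reading, K trades off staying close to the policy's own distribution (K = 1 reduces to greedy decoding) against staying close to the logged trajectory (K equal to the vocabulary size always picks the token nearest the ground truth).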

Cite

Text

Zhang et al. "Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.00510

Markdown

[Zhang et al. "Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/zhang2025cvpr-closedloop/) doi:10.1109/CVPR52734.2025.00510

BibTeX

@inproceedings{zhang2025cvpr-closedloop,
  title     = {{Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models}},
  author    = {Zhang, Zhejun and Karkus, Peter and Igl, Maximilian and Ding, Wenhao and Chen, Yuxiao and Ivanovic, Boris and Pavone, Marco},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {5422--5432},
  doi       = {10.1109/CVPR52734.2025.00510},
  url       = {https://mlanthology.org/cvpr/2025/zhang2025cvpr-closedloop/}
}