Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models

Abstract

Vision-Language-Action (VLA) models for autonomous driving show promise but falter in unstructured corner-case scenarios, largely due to a scarcity of targeted benchmarks. To address this, we introduce Impromptu VLA. Our core contribution is the Impromptu VLA Dataset: over 80,000 meticulously curated video clips, distilled from over 2M source clips drawn from 8 open-source large-scale datasets. This dataset is built upon our novel taxonomy of four challenging unstructured categories and features rich, planning-oriented question-answering annotations and action trajectories. Crucially, experiments demonstrate that VLAs trained with our dataset achieve substantial performance gains on established benchmarks—improving closed-loop NeuroNCAP scores and collision rates, and reaching near state-of-the-art L2 accuracy in open-loop nuScenes trajectory prediction. Furthermore, our Q&A suite serves as an effective diagnostic, revealing clear VLM improvements in perception, prediction, and planning. Our code, data, and models are available at https://github.com/ahydchh/Impromptu-VLA.

Cite

Text

Chi et al. "Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models." Advances in Neural Information Processing Systems, 2025.

Markdown

[Chi et al. "Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/chi2025neurips-impromptu/)

BibTeX

@inproceedings{chi2025neurips-impromptu,
  title     = {{Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models}},
  author    = {Chi, Haohan and Gao, Huan-ang and Liu, Ziming and Liu, Jianing and Liu, Chenyu and Li, Jinwei and Yang, Kaisen and Yu, Yangcheng and Wang, Zeda and Li, Wenyi and Wang, Leichen and Hu, Xingtao and Sun, Hao and Zhao, Hang and Zhao, Hao},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/chi2025neurips-impromptu/}
}