End-to-End Driving with Online Trajectory Evaluation via BEV World Model
Abstract
End-to-end autonomous driving has achieved remarkable progress by integrating perception, prediction, and planning into a fully differentiable framework. Yet, to fully realize its potential, an effective online trajectory evaluation is indispensable to ensure safety. By forecasting the future outcomes of a given trajectory, trajectory evaluation becomes much more effective. This goal can be achieved by employing a world model to capture environmental dynamics and predict future states. Therefore, we propose an end-to-end driving framework **WoTE**, which leverages a BEV **Wo**rld model to predict future BEV states for **T**rajectory **E**valuation. The proposed BEV world model is latency-efficient compared to image-level world models and can be seamlessly supervised using off-the-shelf BEV-space traffic simulators. We validate our framework on both the NAVSIM benchmark and the closed-loop Bench2Drive benchmark based on the CARLA simulator, achieving state-of-the-art performance. Code is released at https://github.com/liyingyanUCAS/WoTE.
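The idea in the abstract (forecast the outcome of each candidate trajectory, then evaluate it) can be illustrated with a toy sketch. This is not the authors' implementation: the constant-velocity `predict_obstacles` function stands in for a learned BEV world model, and `score_trajectory`, `select_trajectory`, the `safe_dist` threshold, and the progress-based score are all hypothetical names and choices made only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def predict_obstacles(obstacles: List[Point], velocities: List[Point],
                      steps: int) -> List[List[Point]]:
    """Toy stand-in for a BEV world model: constant-velocity rollout.

    Returns, for each future step t = 1..steps, the predicted obstacle positions.
    """
    return [
        [(x + vx * t, y + vy * t) for (x, y), (vx, vy) in zip(obstacles, velocities)]
        for t in range(1, steps + 1)
    ]


def score_trajectory(traj: List[Point], futures: List[List[Point]],
                     safe_dist: float = 1.0) -> float:
    """Hypothetical score: forward progress along x, or 0.0 on a predicted collision."""
    for (ex, ey), obstacles_t in zip(traj, futures):
        for ox, oy in obstacles_t:
            if (ex - ox) ** 2 + (ey - oy) ** 2 < safe_dist ** 2:
                return 0.0
    return traj[-1][0]


def select_trajectory(candidates: List[List[Point]], obstacles: List[Point],
                      velocities: List[Point]) -> Tuple[int, List[float]]:
    """Score every candidate against the predicted future and pick the best one."""
    futures = predict_obstacles(obstacles, velocities, steps=len(candidates[0]))
    scores = [score_trajectory(c, futures) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores


# Two candidate ego trajectories (one waypoint per future step).
candidates = [
    [(2.0, 0.0), (4.0, 0.0), (6.0, 0.0)],  # fast: reaches x = 6 at step 3
    [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)],  # slow: reaches x = 3 at step 3
]
# One obstacle crossing the ego lane; it arrives at (6, 0) at step 3.
best, scores = select_trajectory(candidates, [(6.0, 3.0)], [(0.0, -1.0)])
print(best, scores)
```

In this toy setup the fast candidate is rejected because the predicted future places the obstacle on its final waypoint, so the slower candidate is selected. A current-state-only check would not catch this, since neither trajectory is near the obstacle at the start.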
Cite
Text
Li et al. "End-to-End Driving with Online Trajectory Evaluation via BEV World Model." International Conference on Computer Vision, 2025.
Markdown
[Li et al. "End-to-End Driving with Online Trajectory Evaluation via BEV World Model." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/li2025iccv-endtoend/)
BibTeX
@inproceedings{li2025iccv-endtoend,
title = {{End-to-End Driving with Online Trajectory Evaluation via BEV World Model}},
author = {Li, Yingyan and Wang, Yuqi and Liu, Yang and He, Jiawei and Fan, Lue and Zhang, Zhaoxiang},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {27137-27146},
url = {https://mlanthology.org/iccv/2025/li2025iccv-endtoend/}
}