ST-PPO: A Spatio-Temporal Attention Enhanced Proximal Policy Optimization Algorithm for Autonomous Driving in Complex Traffic Scenarios

Da, Cheng; Qian, Yongsheng; Zeng, Junwei; Wei, Xunting; Zhang, Futao

doi:10.1007/S10994-025-06887-X

MLJ 2025 pp. 245

Abstract

Autonomous driving in complex traffic environments poses significant challenges due to the dynamic nature of multi-agent interactions and varying road conditions. This paper proposes ST-PPO, a novel reinforcement learning framework that integrates spatio-temporal attention mechanisms with Proximal Policy Optimization (PPO) for autonomous vehicle control. The framework addresses three critical challenges: spatial feature extraction from complex traffic scenes, temporal dependency modeling of vehicle behaviors, and adaptive policy learning in dynamic environments. The spatial attention module captures crucial spatial relationships between traffic participants, while the temporal attention module models the sequential dependencies of driving behaviors. We evaluate our approach on challenging scenarios including merging areas, continuous curves, intersections, and adverse weather conditions. Comprehensive experiments demonstrate that ST-PPO significantly outperforms baseline methods, achieving a 25.2% improvement in training efficiency and a 7.6% increase in overall performance compared to vanilla PPO. The method also trains more stably, with 18.7% lower KL divergence and more accurate value estimation (explained variance of 0.8). Ablation studies further validate the effectiveness of both the spatial and the temporal attention components. Our method shows particular strength in complex scenarios such as dense traffic and adverse weather conditions, where it maintains stable performance while baseline methods deteriorate significantly. The proposed approach represents a significant step toward robust autonomous driving systems capable of handling real-world traffic complexity.
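
The abstract reports two training diagnostics, KL divergence and explained variance, without saying how they were computed. The sketch below shows the definitions commonly used in PPO implementations; it is an assumption that the paper follows these conventions, and the function names `explained_variance` and `approx_kl` are illustrative, not taken from the paper.

```python
def explained_variance(values, returns):
    """Return 1 - Var(returns - values) / Var(returns).

    1.0 means the value estimates match the returns exactly; 0.0 means they
    do no better than predicting the mean return. (Assumed convention.)
    """
    n = len(returns)
    mean_r = sum(returns) / n
    var_r = sum((r - mean_r) ** 2 for r in returns) / n
    if var_r == 0.0:
        return float("nan")  # undefined when the returns are constant
    residuals = [r - v for r, v in zip(returns, values)]
    mean_e = sum(residuals) / n
    var_e = sum((e - mean_e) ** 2 for e in residuals) / n
    return 1.0 - var_e / var_r


def approx_kl(old_log_probs, new_log_probs):
    """Sample estimate of KL(old || new).

    Both lists hold log-probabilities of the same actions, which were
    sampled under the old policy. (Assumed estimator, mean log-ratio.)
    """
    n = len(old_log_probs)
    return sum(o - w for o, w in zip(old_log_probs, new_log_probs)) / n
```

Under these definitions, an explained variance of 0.8 would mean the value function accounts for about 80% of the variance in the observed returns, and a lower KL divergence would mean smaller policy changes per update.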

Cite

Text

Da et al. "ST-PPO: A Spatio-Temporal Attention Enhanced Proximal Policy Optimization Algorithm for Autonomous Driving in Complex Traffic Scenarios." Machine Learning, 2025. doi:10.1007/S10994-025-06887-X

Markdown

[Da et al. "ST-PPO: A Spatio-Temporal Attention Enhanced Proximal Policy Optimization Algorithm for Autonomous Driving in Complex Traffic Scenarios." Machine Learning, 2025.](https://mlanthology.org/mlj/2025/da2025mlj-stppo/) doi:10.1007/S10994-025-06887-X

BibTeX

@article{da2025mlj-stppo,
  title     = {{ST-PPO: A Spatio-Temporal Attention Enhanced Proximal Policy Optimization Algorithm for Autonomous Driving in Complex Traffic Scenarios}},
  author    = {Da, Cheng and Qian, Yongsheng and Zeng, Junwei and Wei, Xunting and Zhang, Futao},
  journal   = {Machine Learning},
  year      = {2025},
  pages     = {245},
  doi       = {10.1007/S10994-025-06887-X},
  volume    = {114},
  url       = {https://mlanthology.org/mlj/2025/da2025mlj-stppo/}
}