MSMAR-RL: Multi-Step Masked-Attention Recovery Reinforcement Learning for Safe Maneuver Decision in High-Speed Pursuit-Evasion Game
Abstract
Ensuring the safety of high-speed agents in dynamic adversarial environments, such as pursuit-evasion games with target pursuit and obstacle avoidance, is a significant challenge. Existing reinforcement learning methods often fail to balance safety and reward under strict safety constraints and diverse environmental conditions. To address these limitations, this paper proposes a novel zero-constraint-violation recovery RL framework tailored for high-speed UAV pursuit-evasion combat games. The framework includes three key innovations. (1) An extendable multi-step reach-avoid theory: we provide a zero-constraint-violation safety guarantee for multi-strategy reinforcement learning and enable early danger detection in high-speed games. (2) A masked-attention recovery strategy: we introduce a padding-mask attention architecture to handle spatiotemporal variations in dynamic obstacles with varying threat levels. (3) Experimental validation: we validate the framework in obstacle-rich pursuit-evasion scenarios, demonstrating its superiority through comparisons with other algorithms and ablation studies. Our approach also shows potential for extension to other rapid-motion tasks and more complex hazardous scenarios. Details and code can be found at https://msmar-rl.github.io.
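The padding-mask attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is generic scaled dot-product attention with a validity mask, not the authors' architecture: the function name `masked_attention`, the argument layout, and the choice to return a zero context when no obstacle is present are all assumptions made for illustration. The point is only that padded obstacle slots receive exactly zero attention weight, so a fixed-size input can represent a varying number of obstacles.

```python
import math


def masked_attention(query, keys, values, valid):
    """Scaled dot-product attention over a padded set of obstacle slots.

    query:  list of d floats (hypothetical embedding of the agent state)
    keys:   list of N lists of d floats, one per obstacle slot
    values: list of N lists of d_v floats, one per obstacle slot
    valid:  list of N bools; False marks a padding slot to be ignored
    Returns (context, weights).
    """
    d = len(query)
    d_v = len(values[0])
    if not any(valid):
        # Assumption: with no real obstacles, return a zero context
        # rather than dividing by zero in the softmax.
        return [0.0] * d_v, [0.0] * len(keys)

    # Padding slots get a score of -inf so their softmax weight is exactly 0.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) if ok else -math.inf
        for key, ok in zip(keys, valid)
    ]
    m = max(scores)  # finite, because at least one slot is valid
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the value vectors.
    context = [
        sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)
    ]
    return context, weights


# Two real obstacles plus one padding slot whose contents must be ignored.
keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
context, weights = masked_attention([1.0, 0.0], keys, values, [True, True, False])
```

In this usage example the third slot is padding, so its value vector never contributes to `context`, whatever numbers it happens to hold.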
Cite
Text
Zhao et al. "MSMAR-RL: Multi-Step Masked-Attention Recovery Reinforcement Learning for Safe Maneuver Decision in High-Speed Pursuit-Evasion Game." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/36
Markdown
[Zhao et al. "MSMAR-RL: Multi-Step Masked-Attention Recovery Reinforcement Learning for Safe Maneuver Decision in High-Speed Pursuit-Evasion Game." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/zhao2025ijcai-msmar/) doi:10.24963/IJCAI.2025/36
BibTeX
@inproceedings{zhao2025ijcai-msmar,
title = {{MSMAR-RL: Multi-Step Masked-Attention Recovery Reinforcement Learning for Safe Maneuver Decision in High-Speed Pursuit-Evasion Game}},
author = {Zhao, Yang and Zhao, Wenzhe and Li, Xuelong},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {311-319},
doi = {10.24963/IJCAI.2025/36},
url = {https://mlanthology.org/ijcai/2025/zhao2025ijcai-msmar/}
}