HMARL-CBF – Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems
Abstract
We address the problem of safe policy learning in multi-agent safety-critical autonomous systems. In such systems, each agent must meet the safety requirements at all times while also cooperating with other agents to accomplish the task. To this end, we propose a safe Hierarchical Multi-Agent Reinforcement Learning (HMARL) approach based on Control Barrier Functions (CBFs). Our hierarchical approach decomposes the overall reinforcement learning problem into two levels: learning joint cooperative behavior at the higher level, and learning safe individual behavior at the lower (agent) level, conditioned on the high-level policy. Specifically, we propose a skill-based HMARL-CBF algorithm in which the higher-level problem involves learning a joint policy over the skills for all the agents, and the lower-level problem involves learning policies that execute the skills safely with CBFs. We validate our approach on challenging scenarios in which a large number of agents must safely navigate through conflicting road networks. Compared with existing state-of-the-art methods, our approach significantly improves safety, achieving a near-perfect (within $5\%$) success/safety rate, while also improving performance across all the environments.
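The two-level idea can be illustrated with a toy sketch. This is not the authors' algorithm: the learned high-level and low-level policies are replaced by a fixed, hypothetical skill table (`SKILLS`), the agent is assumed to be a single integrator ($\dot{x} = u$), and safety is assumed to mean staying outside one circular obstacle. Under those assumptions, the barrier $h(x) = \|x - c\|^2 - r^2$ gives a single affine CBF constraint on $u$, so the usual quadratic program reduces to a closed-form projection.

```python
# Toy sketch only: names, dynamics, and the obstacle are illustrative assumptions.

# Hypothetical skill table standing in for a learned high-level policy:
# each skill maps to a nominal velocity command for the low level.
SKILLS = {"go_east": (1.0, 0.0), "go_north": (0.0, 1.0), "stop": (0.0, 0.0)}


def barrier(x, center, radius):
    """h(x) = ||x - c||^2 - r^2; h >= 0 means the agent is outside the obstacle."""
    dx, dy = x[0] - center[0], x[1] - center[1]
    return dx * dx + dy * dy - radius * radius


def cbf_filter(x, u_nom, center, radius, alpha=1.0):
    """Return the control closest to u_nom that satisfies dh/dt + alpha * h >= 0.

    For x_dot = u the condition is affine in u: a . u + b >= 0, with
    a = 2 (x - c) and b = alpha * h(x). Assumes x is not exactly at the center.
    """
    a = (2.0 * (x[0] - center[0]), 2.0 * (x[1] - center[1]))
    b = alpha * barrier(x, center, radius)
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] + b
    if slack >= 0.0:
        return u_nom  # nominal command is already safe
    # Project u_nom onto the boundary of the half-space a . u + b >= 0.
    scale = slack / (a[0] * a[0] + a[1] * a[1])
    return (u_nom[0] - scale * a[0], u_nom[1] - scale * a[1])


def rollout(x0, skill, center, radius, steps=200, dt=0.05):
    """Execute one skill with Euler steps; return final state and the minimum h seen."""
    x = x0
    min_h = barrier(x, center, radius)
    for _ in range(steps):
        u = cbf_filter(x, SKILLS[skill], center, radius)
        x = (x[0] + dt * u[0], x[1] + dt * u[1])
        min_h = min(min_h, barrier(x, center, radius))
    return x, min_h


if __name__ == "__main__":
    final_x, min_h = rollout((-2.0, 0.3), "go_east", (0.0, 0.0), 1.0)
    print("final state:", final_x, "minimum barrier value:", min_h)
```

In this sketch the filter only intervenes when the nominal skill command would violate the CBF condition, which mirrors the separation of concerns described above: the higher level decides what to do, and the lower level decides how to do it safely.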
Cite
Text
Ahmad et al. "HMARL-CBF – Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems." Advances in Neural Information Processing Systems, 2025.
Markdown
[Ahmad et al. "HMARL-CBF – Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/ahmad2025neurips-hmarlcbf/)
BibTeX
@inproceedings{ahmad2025neurips-hmarlcbf,
title = {{HMARL-CBF – Hierarchical Multi-Agent Reinforcement Learning with Control Barrier Functions for Safety-Critical Autonomous Systems}},
author = {Ahmad, H M Sabbir and Sabouni, Ehsan and Wasilkoff, Alexander and Budhraja, Param and Guo, Zijian and Zhang, Songyuan and Fan, Chuchu and Cassandras, Christos and Li, Wenchao},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/ahmad2025neurips-hmarlcbf/}
}