Robust Adaptive Multi-Step Predictive Shielding

Abstract

Reinforcement learning for safety-critical tasks requires policies that are both high-performing and safe throughout the learning process. While model-predictive shielding is a promising approach, existing methods are often computationally intractable for the high-dimensional, nonlinear systems where deep RL excels, as they typically rely on a patchwork of local models. We introduce **RAMPS**, a scalable shielding framework that overcomes this limitation by leveraging a learned, linear representation of the environment's dynamics. This model can range from a linear regression in the original state space to a more complex operator learned in a high-dimensional feature space. The key is that this linear structure enables a robust, look-ahead safety technique based on a *multi-step Control Barrier Function (CBF)*. By moving beyond myopic one-step formulations, **RAMPS** accounts for model error and control delays to provide reliable, real-time interventions. The resulting framework is minimally invasive, computationally efficient, and built upon robust control-theoretic foundations. Our experiments demonstrate that **RAMPS** significantly reduces safety violations compared to existing safe RL methods while maintaining high task performance in complex control environments.
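As a rough illustration of the idea described in the abstract, the sketch below fits a linear dynamics model by least squares and uses it to check a multi-step, CBF-style condition before accepting a proposed action. It is not the authors' implementation. The function names, the half-space safe set `w·x <= c`, the constant-action rollout, the finite candidate set, and the constant margin `eps` standing in for model error are all assumptions made for this example.

```python
import numpy as np

def fit_linear_model(X, U, X_next):
    """Least-squares fit of x' ~ A x + B u from transition data (hypothetical helper)."""
    Z = np.hstack([X, U])                            # shape (N, n + m)
    W, *_ = np.linalg.lstsq(Z, X_next, rcond=None)   # shape (n + m, n)
    n = X.shape[1]
    return W[:n].T, W[n:].T                          # A is (n, n), B is (n, m)

def barrier(x, w, c):
    """Assumed barrier for the safe set {x : w.x <= c}; h(x) >= 0 means safe."""
    return c - w @ x

def multi_step_slack(x, u, A, B, w, c, horizon=5, gamma=0.2, eps=0.01):
    """Worst-case slack of h(x_k) >= (1 - gamma)^k * h(x_0) + eps for k = 1..horizon,
    rolling the linear model forward while holding the action u constant."""
    h0 = barrier(x, w, c)
    xk = x
    worst = np.inf
    for k in range(1, horizon + 1):
        xk = A @ xk + B @ u
        worst = min(worst, barrier(xk, w, c) - ((1.0 - gamma) ** k * h0 + eps))
    return worst

def shield(x, u_proposed, candidates, A, B, w, c, **kw):
    """Keep the proposed action if it passes the multi-step check; otherwise return
    the passing candidate closest to it (a minimally invasive override)."""
    if multi_step_slack(x, u_proposed, A, B, w, c, **kw) >= 0:
        return u_proposed
    safe = [u for u in candidates if multi_step_slack(x, u, A, B, w, c, **kw) >= 0]
    if safe:
        return min(safe, key=lambda u: np.linalg.norm(u - u_proposed))
    # No candidate passes: fall back to the least-violating candidate.
    return max(candidates, key=lambda u: multi_step_slack(x, u, A, B, w, c, **kw))

# Toy 1-D example: x' = x + 0.1 u, safe set x <= 1.
A = np.array([[1.0]])
B = np.array([[0.1]])
w = np.array([1.0])
c = 1.0
candidates = [np.array([v]) for v in (-1.0, -0.5, 0.0, 0.5, 1.0)]

u_far = shield(np.array([0.0]), np.array([1.0]), candidates, A, B, w, c)
u_near = shield(np.array([0.9]), np.array([1.0]), candidates, A, B, w, c)
print(u_far, u_near)
```

In this toy setting the proposed action is left untouched far from the boundary and is replaced by the nearest passing candidate close to it. A practical shield would more likely solve a small optimization over actions and use a margin derived from a model-error bound rather than a fixed constant.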

Cite

Text

Ambadkar et al. "Robust Adaptive Multi-Step Predictive Shielding." International Conference on Learning Representations, 2026.

Markdown

[Ambadkar et al. "Robust Adaptive Multi-Step Predictive Shielding." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/ambadkar2026iclr-robust/)

BibTeX

@inproceedings{ambadkar2026iclr-robust,
  title     = {{Robust Adaptive Multi-Step Predictive Shielding}},
  author    = {Ambadkar, Tanmay and Chudiwal, Darshan and Anderson, Greg and Verma, Abhinav},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/ambadkar2026iclr-robust/}
}