Generating Robot Constitutions & Benchmarks for Semantic Safety

Abstract

Large vision and language models are being increasingly deployed on real robots, leading to an immediate need for ensuring robot safety under AI control. In this paper, we develop the ASIMOV Benchmark, a collection of large-scale semantic safety datasets grounded in real-world visual scenes and human injury reports from hospitals (500k situations, 3M instructions). We propose a scalable recipe for data generation leveraging text and image generation techniques to synthesize safety-relevant scenarios. As a second contribution, we develop a framework to automatically generate robot constitutions from real-world data to steer a robot’s behavior using Constitutional AI mechanisms. We report a top alignment rate of 84.3% on the ASIMOV Benchmark using generated constitutions, outperforming no-constitution baselines and human-written constitutions. We argue that human interpretability and modifiability of constitutions inferred from data make them an ideal medium for behavior governance of AI-controlled robots.

Cite

Text

Sermanet et al. "Generating Robot Constitutions & Benchmarks for Semantic Safety." Proceedings of The 9th Conference on Robot Learning, 2025.

Markdown

[Sermanet et al. "Generating Robot Constitutions & Benchmarks for Semantic Safety." Proceedings of The 9th Conference on Robot Learning, 2025.](https://mlanthology.org/corl/2025/sermanet2025corl-generating/)

BibTeX

@inproceedings{sermanet2025corl-generating,
  title     = {{Generating Robot Constitutions \& Benchmarks for Semantic Safety}},
  author    = {Sermanet, Pierre and Majumdar, Anirudha and Irpan, Alex and Kalashnikov, Dmitry and Sindhwani, Vikas},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  year      = {2025},
  pages     = {4767--4823},
  volume    = {305},
  url       = {https://mlanthology.org/corl/2025/sermanet2025corl-generating/}
}