Backward Monte Carlo Tree Search: Charting Unsafe Regions in the Belief-Space

Abstract

Safety-critical systems often operate in partially observable environments, where assessing the safety of the underlying policy remains a fundamental challenge. This study focuses on evaluating policies by identifying regions of the belief-space that can lead the system’s policy to an undesirable state with a non-negligible probability. In this paper, we introduce Backward Monte Carlo Tree Search, the first Monte Carlo tree search framework that expands backward in time within the belief-space. The tree search begins from an undesired terminal belief and recursively explores its possible predecessors, constructing a tree of belief transitions that could lead to an unsafe outcome within a given horizon. Evaluations in gridworld and autonomous driving domains show that identifying beliefs from which failures may occur enables runtime risk forecasting and targeted policy retraining, marking a conceptual shift in how safety is validated under uncertainty.

Cite

Text

Yildiz et al. "Backward Monte Carlo Tree Search: Charting Unsafe Regions in the Belief-Space." Journal of Artificial Intelligence Research, 2026. doi:10.1613/JAIR.1.18011

Markdown

[Yildiz et al. "Backward Monte Carlo Tree Search: Charting Unsafe Regions in the Belief-Space." Journal of Artificial Intelligence Research, 2026.](https://mlanthology.org/jair/2026/yildiz2026jair-backward/) doi:10.1613/JAIR.1.18011

BibTeX

@article{yildiz2026jair-backward,
  title     = {{Backward Monte Carlo Tree Search: Charting Unsafe Regions in the Belief-Space}},
  author    = {Yildiz, Anil and Yel, Esen and Vazquez-Chanlatte, Marcell and Wray, Kyle Hollins and Kochenderfer, Mykel J. and Witwicki, Stefan J.},
  journal   = {Journal of Artificial Intelligence Research},
  year      = {2026},
  doi       = {10.1613/JAIR.1.18011},
  volume    = {85},
  url       = {https://mlanthology.org/jair/2026/yildiz2026jair-backward/}
}