Fast Algorithms for $L_\infty$-Constrained S-Rectangular Robust MDPs

Abstract

Robust Markov decision processes (RMDPs) are a useful building block of robust reinforcement learning algorithms but can be hard to solve. This paper proposes a fast, exact algorithm for computing the Bellman operator for S-rectangular robust Markov decision processes with $L_\infty$-constrained rectangular ambiguity sets. The algorithm combines a novel homotopy continuation method with a bisection method to solve S-rectangular ambiguity in quasi-linear time in the number of states and actions. The algorithm improves on the cubic time required by leading general linear programming methods. Our experimental results confirm the practical viability of our method and show that it outperforms a leading commercial optimization package by several orders of magnitude.

PDF NeurIPS OpenReview Semantic Scholar

Cite

Text

Behzadian et al. "Fast Algorithms for $L_\infty$-Constrained S-Rectangular Robust MDPs." Neural Information Processing Systems, 2021.

Markdown

[Behzadian et al. "Fast Algorithms for $L_\infty$-Constrained S-Rectangular Robust MDPs." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/behzadian2021neurips-fast/)

BibTeX

@inproceedings{behzadian2021neurips-fast,
  title     = {{Fast Algorithms for $L_\infty$-Constrained S-Rectangular Robust MDPs}},
  author    = {Behzadian, Bahram and Petrik, Marek and Ho, Chin Pang},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/behzadian2021neurips-fast/}
}