Risk Aversion in Markov Decision Processes via near Optimal Chernoff Bounds

Abstract

The expected return is a widely used objective in decision making under uncertainty. Many algorithms, such as value iteration, have been proposed to optimize it. In risk-aware settings, however, the expected return is often not an appropriate objective to optimize. We propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties. We also draw connections to previously proposed objectives for risk-aware planning: minmax, exponential utility, percentile and mean minus variance. Our method applies to an extended class of Markov decision processes: we allow costs to be stochastic as long as they are bounded. Additionally, we present an efficient algorithm for optimizing the proposed objective. Synthetic and real-world experiments illustrate the effectiveness of our method at scale.
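The Chernoff bound in the title refers to the standard exponential-moment tail bound, P(X >= a) <= min over t > 0 of E[exp(tX)] exp(-ta). As a minimal illustrative sketch only (not the paper's objective or algorithm), the Python snippet below estimates such a bound on the total cost of a fixed policy from Monte Carlo samples; the sampled cost distribution, the grid of t values, and the threshold are assumptions made purely for this example.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: sampled total costs of one fixed policy in a
# stochastic-cost MDP (here drawn from a normal distribution for illustration).
costs = rng.normal(loc=10.0, scale=2.0, size=10_000)

def chernoff_tail_bound(samples, a, t_grid=np.linspace(1e-3, 5.0, 500)):
    """Empirical Chernoff upper bound on P(X >= a) from samples of X."""
    # E[exp(t X)] * exp(-t a), minimized over a grid of positive t values.
    mgf = np.exp(t_grid[:, None] * samples[None, :]).mean(axis=1)
    bounds = mgf * np.exp(-t_grid * a)
    return float(bounds.min())

threshold = 16.0
print("empirical tail frequency:", float((costs >= threshold).mean()))
print("Chernoff upper bound    :", chernoff_tail_bound(costs, threshold))

Comparing the bound with the empirical tail frequency shows the usual behavior: the Chernoff bound is conservative but gives a certifiable guarantee on the probability of exceeding a cost threshold, which is the kind of risk statement the mean-only objective cannot provide.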

Cite

Text

Moldovan and Abbeel. "Risk Aversion in Markov Decision Processes via near Optimal Chernoff Bounds." Neural Information Processing Systems, 2012.

Markdown

[Moldovan and Abbeel. "Risk Aversion in Markov Decision Processes via near Optimal Chernoff Bounds." Neural Information Processing Systems, 2012.](https://mlanthology.org/neurips/2012/moldovan2012neurips-risk/)

BibTeX

@inproceedings{moldovan2012neurips-risk,
  title     = {{Risk Aversion in Markov Decision Processes via near Optimal Chernoff Bounds}},
  author    = {Moldovan, Teodor M. and Abbeel, Pieter},
  booktitle = {Neural Information Processing Systems},
  year      = {2012},
  pages     = {3131--3139},
  url       = {https://mlanthology.org/neurips/2012/moldovan2012neurips-risk/}
}