Reinforcement Learning for MDPs with Constraints

Abstract

In this article, I will consider Markov Decision Processes with two criteria, each defined as the expected value of an infinite-horizon cumulative return. The second criterion is either itself subject to an inequality constraint, or there is a maximum allowable probability that the individual returns violate the constraint. I describe and discuss three new reinforcement learning approaches for solving such control problems.
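The setting described in the abstract can be written as a constrained MDP. The following is a generic formalization, not notation taken from the paper: $r_t$ and $c_t$ denote the two per-step returns, $d$ a constraint threshold, and $\varepsilon$ the maximum allowable violation probability (a discount factor $\gamma$ is assumed to make the infinite-horizon sums well defined):

```latex
\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_t\right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} c_t\right] \le d
\quad \text{or} \quad
\Pr_{\pi}\!\left(\sum_{t=0}^{\infty} \gamma^{t} c_t > d\right) \le \varepsilon .
```

The first variant constrains the second criterion in expectation; the second bounds the probability that a single realized return violates the constraint, matching the two cases the abstract distinguishes.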

Cite

Text

Geibel. "Reinforcement Learning for MDPs with Constraints." European Conference on Machine Learning, 2006. doi:10.1007/11871842_63

Markdown

[Geibel. "Reinforcement Learning for MDPs with Constraints." European Conference on Machine Learning, 2006.](https://mlanthology.org/ecmlpkdd/2006/geibel2006ecml-reinforcement/) doi:10.1007/11871842_63

BibTeX

@inproceedings{geibel2006ecml-reinforcement,
  title     = {{Reinforcement Learning for MDPs with Constraints}},
  author    = {Geibel, Peter},
  booktitle = {European Conference on Machine Learning},
  year      = {2006},
  pages     = {646--653},
  doi       = {10.1007/11871842_63},
  url       = {https://mlanthology.org/ecmlpkdd/2006/geibel2006ecml-reinforcement/}
}