Policy Learning with Constraints in Model-Free Reinforcement Learning: A Survey

Abstract

Reinforcement Learning (RL) algorithms have had tremendous success in simulated domains. These algorithms, however, often cannot be directly applied to physical systems, especially when there are constraints to satisfy (e.g., to ensure safety or to limit resource consumption). In standard RL, the agent is incentivized to explore any policy with the sole goal of maximizing reward; in the real world, however, certain constraints must also be satisfied during the learning process. In this article, we survey existing approaches to handling constraints in model-free reinforcement learning. We model the problem of learning with constraints as a Constrained Markov Decision Process and consider two main types of constraints: cumulative and instantaneous. We summarize these approaches and discuss their strengths and weaknesses. To evaluate policy performance under constraints, we introduce a set of standard benchmarks and metrics. Finally, we discuss limitations of current methods and present open questions for future research.
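
For reference, a minimal sketch of the constrained objective the abstract refers to, written in standard Constrained Markov Decision Process (CMDP) notation; the specific symbols below (cost functions c_i, thresholds d_i, discount factor gamma) are the usual textbook convention rather than notation quoted from the paper:

\[
\max_{\pi} \; J_R(\pi) \;=\; \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\]
\[
\text{s.t.}\quad J_{C_i}(\pi) \;=\; \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \;\le\; d_i \qquad \text{(cumulative constraints)},
\]
\[
\phantom{\text{s.t.}}\quad c_j(s_t, a_t) \;\le\; d_j \;\; \forall t \qquad \text{(instantaneous constraints)}.
\]

Cumulative constraints bound an expected (discounted) sum of costs over a trajectory, while instantaneous constraints must hold at every time step.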

Cite

Text

Liu et al. "Policy Learning with Constraints in Model-Free Reinforcement Learning: A Survey." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/614

Markdown

[Liu et al. "Policy Learning with Constraints in Model-Free Reinforcement Learning: A Survey." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/liu2021ijcai-policy/) doi:10.24963/IJCAI.2021/614

BibTeX

@inproceedings{liu2021ijcai-policy,
  title     = {{Policy Learning with Constraints in Model-Free Reinforcement Learning: A Survey}},
  author    = {Liu, Yongshuai and Halev, Avishai and Liu, Xin},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {4508--4515},
  doi       = {10.24963/IJCAI.2021/614},
  url       = {https://mlanthology.org/ijcai/2021/liu2021ijcai-policy/}
}