Safe and Sample-Efficient Reinforcement Learning Algorithms for Factored Environments
Abstract
Reinforcement Learning (RL) deals with problems that can be modeled as a Markov Decision Process (MDP) where the transition function is unknown. In situations where an arbitrary policy π is already being executed and the experiences with the environment have been recorded in a batch D, an RL algorithm can use D to compute a new policy π'. However, the policy computed by traditional RL algorithms may perform worse than π. Our goal is to develop safe RL algorithms, where the agent has high confidence that the performance of π' is better than the performance of π given D. To develop sample-efficient and safe RL algorithms, we combine ideas from exploration strategies in RL with a safe policy improvement method.
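The high-confidence test sketched in the abstract — accept π' only when we are confident it outperforms the behavior policy on the batch D — can be illustrated in a tabular setting with per-episode importance sampling and a Hoeffding-style lower bound. This is a minimal sketch under assumptions of my own (tabular policies as nested dicts, returns bounded above by the largest observed weighted return, a simple Hoeffding bound), not the paper's exact method:

```python
import math

def evaluate_policy_is(batch, pi_new, pi_behavior):
    """Per-episode importance-sampled returns of pi_new, estimated
    from a batch of episodes collected under pi_behavior.
    Each episode is a list of (state, action, reward) triples;
    policies are dicts mapping state -> {action: probability}."""
    returns = []
    for episode in batch:
        weight, ret = 1.0, 0.0
        for (s, a, r) in episode:
            # Reweight by the likelihood ratio of the two policies.
            weight *= pi_new[s][a] / pi_behavior[s][a]
            ret += r
        returns.append(weight * ret)
    return returns

def is_safe_improvement(batch, pi_new, pi_behavior, baseline, delta=0.05):
    """High-confidence improvement test: accept pi_new only if the
    (1 - delta) lower confidence bound on its estimated value exceeds
    the baseline performance of the behavior policy. Illustrative
    Hoeffding-style bound; assumes returns lie in [0, b]."""
    rets = evaluate_policy_is(batch, pi_new, pi_behavior)
    n = len(rets)
    b = max(rets) if rets else 1.0  # assumed range bound (illustrative)
    mean = sum(rets) / n
    lower = mean - b * math.sqrt(math.log(1 / delta) / (2 * n))
    return lower > baseline
```

With few episodes the confidence interval is wide, so the test conservatively rejects π' and keeps the behavior policy, which is the safety behavior the abstract describes; tighter bounds (or exploiting factored structure, as the paper proposes) reduce the amount of data needed before an improvement is accepted.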
Cite
Text
Simão. "Safe and Sample-Efficient Reinforcement Learning Algorithms for Factored Environments." International Joint Conference on Artificial Intelligence, 2019. doi:10.24963/IJCAI.2019/919
Markdown
[Simão. "Safe and Sample-Efficient Reinforcement Learning Algorithms for Factored Environments." International Joint Conference on Artificial Intelligence, 2019.](https://mlanthology.org/ijcai/2019/simao2019ijcai-safe/) doi:10.24963/IJCAI.2019/919
BibTeX
@inproceedings{simao2019ijcai-safe,
title = {{Safe and Sample-Efficient Reinforcement Learning Algorithms for Factored Environments}},
author = {Simão, Thiago D.},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2019},
pages = {6460--6461},
doi = {10.24963/IJCAI.2019/919},
url = {https://mlanthology.org/ijcai/2019/simao2019ijcai-safe/}
}