Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes
Abstract
Long-run average optimization problems for Markov decision processes (MDPs) require constructing policies with optimal steady-state behavior, i.e., optimal limit frequency of visits to the states. However, such policies may suffer from local instability in the sense that the frequency of states visited in a bounded time horizon along a run differs significantly from the limit frequency. In this work, we propose an efficient algorithmic solution to this problem.
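To make the gap between limit frequency and bounded-horizon frequency concrete, here is a small illustrative sketch (not the paper's algorithm; the chain, window length, and all names are assumptions chosen for the example). It simulates a "sticky" 2-state Markov chain whose limit frequency of each state is 0.5, yet short windows along a run often sit entirely in one state, so the locally observed frequency deviates strongly from the limit.

```python
import random

def stationary_2state(p01, p10):
    # Closed-form stationary distribution of a 2-state chain with
    # transition probabilities p01 (0 -> 1) and p10 (1 -> 0).
    return (p10 / (p01 + p10), p01 / (p01 + p10))

def simulate(P, steps, seed=0):
    # Sample a run of the chain with transition matrix P, starting in state 0.
    random.seed(seed)
    state, run = 0, []
    for _ in range(steps):
        run.append(state)
        r, acc = random.random(), 0.0
        for s, p in enumerate(P[state]):  # inverse-CDF sampling over row
            acc += p
            if r < acc:
                state = s
                break
    return run

P = [[0.9, 0.1], [0.1, 0.9]]            # sticky chain: stays put w.p. 0.9
limit = stationary_2state(P[0][1], P[1][0])  # limit frequency: (0.5, 0.5)
run = simulate(P, 10_000)

W = 10  # bounded time horizon (window length)
window_freqs = [sum(1 for s in run[i:i + W] if s == 0) / W
                for i in range(0, len(run) - W, W)]
max_dev = max(abs(f - limit[0]) for f in window_freqs)
# Many length-10 windows are spent entirely in one state, so max_dev
# is far from 0 even though the limit frequency of state 0 is exactly 0.5.
```

This is precisely the local instability the abstract describes: the steady-state frequencies are optimal in the limit, but the frequencies observed over bounded windows can differ significantly.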
Cite
Text
Klaška et al. "Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I18.29993

Markdown

[Klaška et al. "Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/klaska2024aaai-optimizing/) doi:10.1609/AAAI.V38I18.29993

BibTeX
@inproceedings{klaska2024aaai-optimizing,
title = {{Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes}},
author = {Klaška, David and Kučera, Antonín and Kůr, Vojtěch and Musil, Vít and Řehák, Vojtěch},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
pages = {20143--20150},
doi = {10.1609/AAAI.V38I18.29993},
url = {https://mlanthology.org/aaai/2024/klaska2024aaai-optimizing/}
}