Certified Policy Verification and Synthesis for MDPs Under Distributional Reach-Avoidance Properties

Abstract

A key problem in the design of normative multi-agent systems (MAS) is the cost of enforcing a norm (for the system operator) or complying with the norm (for the system users). If the cost is too high, ensuring compliant behavior may be uneconomic, or users may be deterred from participating in the MAS. In this paper, we consider the problem of synthesizing minimum-cost dynamic norms to satisfy a system-level objective specified in Alternating-Time Temporal Logic with Strategy Contexts (ATLsc∗). We show that synthesizing a dynamic norm under a bound on the cost of any prohibited set of actions has the same complexity as synthesizing arbitrary norms. We also show that synthesizing norms that minimize the average cost of the prohibited set of actions is unsolvable; however, synthesizing ε-optimal norms is possible.

Cite

Text

Akshay et al. "Certified Policy Verification and Synthesis for MDPs Under Distributional Reach-Avoidance Properties." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/1

Markdown

[Akshay et al. "Certified Policy Verification and Synthesis for MDPs Under Distributional Reach-Avoidance Properties." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/akshay2024ijcai-certified/) doi:10.24963/ijcai.2024/1

BibTeX

@inproceedings{akshay2024ijcai-certified,
  title     = {{Certified Policy Verification and Synthesis for MDPs Under Distributional Reach-Avoidance Properties}},
  author    = {Akshay, S. and Chatterjee, Krishnendu and Meggendorfer, Tobias and Zikelic, Dorde},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {3--12},
  doi       = {10.24963/ijcai.2024/1},
  url       = {https://mlanthology.org/ijcai/2024/akshay2024ijcai-certified/}
}