Planning with Participation Constraints

Abstract

We pose and study the problem of planning in Markov decision processes (MDPs), subject to participation constraints as studied in mechanism design. In this problem, a planner must work with a self-interested agent on a given MDP. Each action in the MDP provides an immediate reward to the planner and a (possibly different) reward to the agent. The agent has no control over which actions are chosen, but has the option to end the entire process at any time. The goal of the planner is to find a policy that maximizes her cumulative reward, taking into account the agent's ability to terminate. We give a fully polynomial-time approximation scheme for this problem. En route, we present polynomial-time algorithms for computing (exact) optimal policies for important special cases of this problem, including when the time horizon is constant, or when the MDP exhibits a "definitive decisions" property. We illustrate our algorithms with two different game-theoretic applications: the problem of assigning rides in ride-sharing and the problem of designing screening policies. Our results imply efficient algorithms for computing (approximately) optimal policies in both applications.
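
To make the setup concrete, here is a minimal, hypothetical Python sketch. It is not the paper's algorithm: it brute-forces deterministic Markovian policies on a tiny toy instance and keeps only those under which the agent's expected continuation utility stays nonnegative at every point the process reaches (i.e., the agent never prefers to quit, assuming an outside option of 0). The state/action sets, rewards, horizon, and deterministic transitions below are all made up for illustration.

```python
"""Illustrative sketch (hypothetical toy instance, not the paper's algorithm):
search over deterministic Markovian policies in a small finite-horizon MDP,
subject to the participation constraint that the agent's continuation
utility is nonnegative whenever the process is still running."""
import itertools

# Toy instance: two states, two actions, horizon 3, deterministic
# transitions, and separate per-action rewards for planner and agent.
STATES = [0, 1]
ACTIONS = [0, 1]
HORIZON = 3

def step(s, a):  # deterministic transition, for simplicity
    return (s + a) % len(STATES)

R_PLANNER = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 3.0, (1, 1): 0.0}
R_AGENT   = {(0, 0): -1.0, (0, 1): 1.0, (1, 0): -1.0, (1, 1): 2.0}

def continuation_values(policy):
    """Backward induction: planner/agent continuation values under `policy`,
    where policy[(s, t)] is the action taken in state s at time t."""
    V_p = {(s, HORIZON): 0.0 for s in STATES}
    V_a = {(s, HORIZON): 0.0 for s in STATES}
    for t in range(HORIZON - 1, -1, -1):
        for s in STATES:
            a = policy[(s, t)]
            s2 = step(s, a)
            V_p[(s, t)] = R_PLANNER[(s, a)] + V_p[(s2, t + 1)]
            V_a[(s, t)] = R_AGENT[(s, a)] + V_a[(s2, t + 1)]
    return V_p, V_a

def participation_ok(policy, V_a, s0=0):
    """Check V_a >= 0 along the on-path trajectory from s0: if it ever
    drops below 0, the agent would rather terminate the process."""
    s = s0
    for t in range(HORIZON):
        if V_a[(s, t)] < 0:
            return False
        s = step(s, policy[(s, t)])
    return True

# Enumerate all deterministic Markovian policies (fine for a toy instance).
keys = [(s, t) for t in range(HORIZON) for s in STATES]
best = None
for choice in itertools.product(ACTIONS, repeat=len(keys)):
    policy = dict(zip(keys, choice))
    V_p, V_a = continuation_values(policy)
    if participation_ok(policy, V_a) and (best is None or V_p[(0, 0)] > best[0]):
        best = (V_p[(0, 0)], policy)

print("planner value of best participation-feasible policy:", best[0])
```

This enumeration is exponential in the number of (state, time) pairs and only serves to illustrate the constraint; the general problem is harder, which is why the paper develops an FPTAS and gives exact polynomial-time algorithms only for special cases such as constant horizon or MDPs with the "definitive decisions" property.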

Cite

Text

Zhang et al. "Planning with Participation Constraints." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I5.20462

Markdown

[Zhang et al. "Planning with Participation Constraints." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/zhang2022aaai-planning/) doi:10.1609/AAAI.V36I5.20462

BibTeX

@inproceedings{zhang2022aaai-planning,
  title     = {{Planning with Participation Constraints}},
  author    = {Zhang, Hanrui and Cheng, Yu and Conitzer, Vincent},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {5260-5267},
  doi       = {10.1609/AAAI.V36I5.20462},
  url       = {https://mlanthology.org/aaai/2022/zhang2022aaai-planning/}
}