Monte-Carlo Search for an Equilibrium in Dec-POMDPs

Abstract

Decentralized partially observable Markov decision processes (Dec-POMDPs) formalize the problem of designing individual controllers for a group of collaborative agents under stochastic dynamics and partial observability. Seeking a global optimum is difficult (NEXP-complete), but seeking a Nash equilibrium, where each agent's policy is a best response to the other agents' policies, is more accessible and has allowed addressing infinite-horizon problems with solutions in the form of finite-state controllers (FSCs). In this paper, we show that this approach can be adapted to cases where only a generative model (a simulator) of the Dec-POMDP is available. This requires relying on a simulation-based POMDP solver to construct an agent's FSC node by node. A related process is used to heuristically derive initial FSCs. Experiments on benchmarks show that the resulting algorithm, MC-JESP, is competitive with existing Dec-POMDP solvers and even outperforms many offline methods that use explicit models.
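
To make the equilibrium-seeking idea concrete, here is a minimal Python sketch of the JESP-style best-response loop the abstract describes: agents take turns computing a best response to the others' fixed FSCs until no agent can improve. All names below (GenerativeModel, best_response_fsc, evaluate_joint) are hypothetical placeholders for illustration, not the authors' actual API.

def jesp_equilibrium(generative_model, initial_fscs, epsilon=1e-6):
    """Iterate best responses until a (local) Nash equilibrium is reached.

    generative_model -- a simulator of the Dec-POMDP (no explicit model needed)
    initial_fscs     -- one heuristically derived FSC per agent
    epsilon          -- minimum Monte-Carlo value improvement to accept
    """
    fscs = list(initial_fscs)
    value = evaluate_joint(generative_model, fscs)  # Monte-Carlo value estimate
    improved = True
    while improved:
        improved = False
        for i in range(len(fscs)):
            # With the other agents' FSCs fixed, agent i faces a POMDP whose
            # hidden state combines the environment state and the others' FSC
            # nodes. A simulation-based POMDP solver builds agent i's new FSC
            # node by node against the generative model.
            candidate = best_response_fsc(generative_model, fscs, agent=i)
            new_fscs = fscs[:i] + [candidate] + fscs[i + 1:]
            new_value = evaluate_joint(generative_model, new_fscs)
            if new_value > value + epsilon:
                fscs, value = new_fscs, new_value
                improved = True
    return fscs, value

Because each accepted step strictly improves the Monte-Carlo joint value estimate, the loop terminates at a point where no single agent can improve its policy, i.e., a Nash equilibrium in the best-response sense described above.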

Cite

Text

You et al. "Monte-Carlo Search for an Equilibrium in Dec-POMDPs." Uncertainty in Artificial Intelligence, 2023.

Markdown

[You et al. "Monte-Carlo Search for an Equilibrium in Dec-POMDPs." Uncertainty in Artificial Intelligence, 2023.](https://mlanthology.org/uai/2023/you2023uai-montecarlo/)

BibTeX

@inproceedings{you2023uai-montecarlo,
  title     = {{Monte-Carlo Search for an Equilibrium in Dec-POMDPs}},
  author    = {You, Yang and Thomas, Vincent and Colas, Francis and Buffet, Olivier},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2023},
  pages     = {2444--2453},
  volume    = {216},
  url       = {https://mlanthology.org/uai/2023/you2023uai-montecarlo/}
}