Team-Imitate-Synchronize for Solving Dec-POMDPs
Abstract
Multi-agent collaboration under partial observability is a difficult task. Multi-agent reinforcement learning (MARL) algorithms that do not leverage a model of the environment struggle with tasks that require sequences of collaborative actions, while Dec-POMDP algorithms that use such models to compute near-optimal policies scale poorly. In this paper, we suggest the Team-Imitate-Synchronize (TIS) approach, a heuristic, model-based method for solving such problems. Our approach begins by solving the joint team problem, assuming that observations are shared. Then, for each agent, we solve a single-agent problem designed to imitate that agent's behavior within the team plan. Finally, we adjust the single-agent policies for better synchronization. Our experiments demonstrate that our method provides solutions comparable to those of Dec-POMDP solvers on small problems while scaling to much larger problems, and that it finds collaborative plans that MARL algorithms are unable to identify.
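The three phases described in the abstract can be sketched as a pipeline. This is a minimal illustrative skeleton only: the function names, data structures, and stubbed solver bodies are hypothetical stand-ins, not the authors' implementation, which involves actual POMDP solving in each phase.

```python
# Hypothetical sketch of the Team-Imitate-Synchronize (TIS) pipeline.
# All names and return values below are illustrative stubs, not the
# authors' actual solvers.

def solve_team_pomdp(problem):
    # Phase 1 (Team): solve the joint problem assuming all observations
    # are shared, yielding one centralized team policy (stubbed here).
    return {"type": "team-policy", "problem": problem}

def imitate_agent(team_policy, agent):
    # Phase 2 (Imitate): construct and solve a single-agent problem whose
    # solution mimics this agent's behavior under the team policy (stubbed).
    return {"type": "agent-policy", "agent": agent,
            "imitates": team_policy["problem"]}

def synchronize(policies):
    # Phase 3 (Synchronize): adjust the individual policies so their joint
    # execution stays coordinated (stubbed as a pass-through).
    return policies

def team_imitate_synchronize(problem, agents):
    # Run the three phases in sequence: team plan, per-agent imitation,
    # then synchronization of the resulting single-agent policies.
    team_policy = solve_team_pomdp(problem)
    policies = {a: imitate_agent(team_policy, a) for a in agents}
    return synchronize(policies)
```

Under these assumptions, calling `team_imitate_synchronize("toy-problem", ["agent1", "agent2"])` returns one (decentralized) policy per agent, which is the shape of the final output the method produces.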
Cite
Text
Abdoo et al. "Team-Imitate-Synchronize for Solving Dec-POMDPs." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022. doi:10.1007/978-3-031-26412-2_14
Markdown
[Abdoo et al. "Team-Imitate-Synchronize for Solving Dec-POMDPs." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2022.](https://mlanthology.org/ecmlpkdd/2022/abdoo2022ecmlpkdd-teamimitatesynchronize/) doi:10.1007/978-3-031-26412-2_14
BibTeX
@inproceedings{abdoo2022ecmlpkdd-teamimitatesynchronize,
title = {{Team-Imitate-Synchronize for Solving Dec-POMDPs}},
author = {Abdoo, Eliran and Brafman, Ronen I. and Shani, Guy and Soffair, Nitsan},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2022},
  pages = {216--232},
doi = {10.1007/978-3-031-26412-2_14},
url = {https://mlanthology.org/ecmlpkdd/2022/abdoo2022ecmlpkdd-teamimitatesynchronize/}
}