Learning NP-Hard Multi-Agent Assignment Planning Using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-Learning

Abstract

This paper explores the possibility of near-optimally solving multi-agent, multi-task NP-hard planning problems with time-dependent rewards using a learning-based algorithm. In particular, we consider a class of robot/machine scheduling problems called the multi-robot reward collection problem (MRRC). Such MRRC problems well model ride-sharing, pickup-and-delivery, and a variety of related problems. In representing the MRRC problem as a sequential decision-making problem, we observe that each state can be represented as an extension of probabilistic graphical models (PGMs), which we refer to as random PGMs. We then develop a mean-field inference method for random PGMs. We then propose (1) an order-transferable Q-function estimator and (2) an order-transferability-enabled auction to select a joint assignment in polynomial-time. These result in a reinforcement learning framework with at least $1-1/e$ optimality. Experimental results on solving MRRC problems highlight the near-optimality and transferability of the proposed methods. We also consider identical parallel machine scheduling problems (IPMS) and minimax multiple traveling salesman problems (minimax-mTSP).

Cite

Text

Kang et al. "Learning NP-Hard Multi-Agent Assignment Planning Using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-Learning." Neural Information Processing Systems, 2022.

Markdown

[Kang et al. "Learning NP-Hard Multi-Agent Assignment Planning Using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-Learning." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/kang2022neurips-learning/)

BibTeX

@inproceedings{kang2022neurips-learning,
  title     = {{Learning NP-Hard Multi-Agent Assignment Planning Using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-Learning}},
  author    = {Kang, Hyunwook and Kwon, Taehwan and Park, Jinkyoo and Morrison, James R.},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/kang2022neurips-learning/}
}