Distributed Inverse Constrained Reinforcement Learning for Multi-Agent Systems
Abstract
This paper considers the problem of recovering the policies of multiple interacting experts by estimating their reward functions and constraints, where the experts' demonstration data are distributed across a group of learners. We formulate this problem as a distributed bi-level optimization problem and propose a novel bi-level "distributed inverse constrained reinforcement learning" (D-ICRL) algorithm that allows the learners to collaboratively estimate the constraints in the outer loop and learn the corresponding policies and reward functions in the inner loop from the distributed demonstrations through intermittent communications. We formally guarantee that the distributed learners asymptotically reach a consensus that belongs to the set of stationary points of the bi-level optimization problem.
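The abstract describes a bi-level loop: each learner solves an inner problem (policy and reward learning from its local demonstrations) given the current constraint estimate, takes a local outer step on that estimate, and intermittently averages with the other learners to reach consensus. The following is a minimal schematic sketch of that structure, based only on the abstract; the update rules, the least-squares stand-in for the inner RL subproblem, the mixing matrix, and all names (e.g., inner_loop) are illustrative assumptions, not the authors' algorithm.

import numpy as np

rng = np.random.default_rng(0)
N_LEARNERS, DIM = 4, 8
# Doubly stochastic mixing matrix for the consensus step (assumed topology).
W = np.full((N_LEARNERS, N_LEARNERS), 1.0 / N_LEARNERS)

# Each learner holds its own shard of expert demonstrations (toy data).
demos = [rng.normal(size=(50, DIM)) for _ in range(N_LEARNERS)]
constraints = [rng.normal(size=DIM) for _ in range(N_LEARNERS)]  # outer variables

def inner_loop(constraint, data, steps=20, lr=0.1):
    """Inner loop: fit a reward/policy surrogate given the current
    constraint estimate (a quadratic stand-in for the RL subproblem)."""
    theta = np.zeros(DIM)
    for _ in range(steps):
        grad = theta - data.mean(axis=0) + 0.1 * constraint
        theta -= lr * grad
    return theta

for t in range(200):  # outer loop over constraint estimates
    thetas = [inner_loop(c, d) for c, d in zip(constraints, demos)]
    # Local outer gradient step on each learner's constraint estimate.
    constraints = [c - 0.05 * (c - th) for c, th in zip(constraints, thetas)]
    if t % 10 == 0:  # intermittent communication: consensus averaging
        constraints = list(W @ np.stack(constraints))

spread = max(np.linalg.norm(c - constraints[0]) for c in constraints)
print(f"disagreement across learners after training: {spread:.2e}")

In this toy setting the printed disagreement shrinks toward zero, illustrating the consensus behavior the paper proves for the actual D-ICRL iterates.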
Cite
Text
Liu and Zhu. "Distributed Inverse Constrained Reinforcement Learning for Multi-Agent Systems." Neural Information Processing Systems, 2022.Markdown
[Liu and Zhu. "Distributed Inverse Constrained Reinforcement Learning for Multi-Agent Systems." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/liu2022neurips-distributed/)BibTeX
@inproceedings{liu2022neurips-distributed,
  title     = {{Distributed Inverse Constrained Reinforcement Learning for Multi-Agent Systems}},
  author    = {Liu, Shicheng and Zhu, Minghui},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/liu2022neurips-distributed/}
}