Generating Diverse Cooperative Agents by Learning Incompatible Policies
Abstract
Effectively training a robust agent that can cooperate with unseen agents requires a diverse set of training partners. Nonetheless, obtaining cooperative agents with diverse behaviors is a challenging task. Previous work proposes learning a diverse set of agents by diversifying their state-action distributions. However, without information about the task's goal, the diversified behaviors are not guided toward other important, albeit non-optimal, solutions, resulting in only local variations of a single solution. In this work, we propose learning diverse behaviors through policy compatibility, while still using state-action information to induce local variations of behaviors. Conceptually, policy compatibility measures whether the policies of interest can collectively solve a task. We posit that incompatible policies can be behaviorally different. Based on this idea, we propose a novel objective for learning diverse behaviors. We theoretically show that this objective can generate a dissimilar policy, and we incorporate it into a population-based training scheme. Empirically, the proposed method outperforms the baselines in terms of the number of discovered solutions given the same number of agents.
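The abstract's core idea can be sketched as a simple objective: each policy should succeed when paired with itself (self-play) while failing when paired with any other population member (cross-play). The sketch below is illustrative only and is not the authors' implementation; the function name, the weighting term `lam`, and the worst-case aggregation over partners are assumptions.

```python
def incompatibility_objective(self_play_return, cross_play_returns, lam=1.0):
    """Illustrative diversity objective (assumed form, not the paper's code).

    Higher is better: the policy solves the task with itself but is
    incompatible with every other member of the population.

    self_play_return: expected return when the policy plays with itself.
    cross_play_returns: expected returns when paired with each other
        population member.
    lam: assumed weight trading off task success against incompatibility.
    """
    # Penalize the *most compatible* other partner, so the policy becomes
    # incompatible with the whole population rather than just on average.
    worst_case_compat = max(cross_play_returns) if cross_play_returns else 0.0
    return self_play_return - lam * worst_case_compat
```

Under this sketch, a policy that cooperates well with itself (return 10) but is still somewhat compatible with one partner (return 5) scores lower than one that is incompatible with everyone, pushing the population toward behaviorally distinct solutions.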
Cite
Text
Charakorn et al. "Generating Diverse Cooperative Agents by Learning Incompatible Policies." ICML 2022 Workshops: AI4ABM, 2022.
Markdown
[Charakorn et al. "Generating Diverse Cooperative Agents by Learning Incompatible Policies." ICML 2022 Workshops: AI4ABM, 2022.](https://mlanthology.org/icmlw/2022/charakorn2022icmlw-generating/)
BibTeX
@inproceedings{charakorn2022icmlw-generating,
title = {{Generating Diverse Cooperative Agents by Learning Incompatible Policies}},
author = {Charakorn, Rujikorn and Manoonpong, Poramate and Dilokthanakul, Nat},
booktitle = {ICML 2022 Workshops: AI4ABM},
year = {2022},
url = {https://mlanthology.org/icmlw/2022/charakorn2022icmlw-generating/}
}