Acting Beyond Learning: Imagination-Assisted Decision-Making in the Visual-Based Multi-Agent Cooperative Scenarios

Abstract

Learning optimal policies in multi-agent cooperative settings with visual observations is important but challenging: agents must first learn state representations from their image observations and then learn policies in the resulting abstract state space. To address this problem, we propose a novel model-based multi-agent reinforcement learning (MARL) method named Contrastive Latent World for Policy Optimization (CLWPO). In CLWPO, we first design a state representation model to facilitate learning in the latent state space. Building on this model, we construct the latent world and introduce a contrastive variational bound (CVB) to optimize it. We then develop a heuristic policy optimization (HPO) scheme that combines model-free learning with model-based planning to obtain robust policies that predict future behaviors. In particular, during planning, we maintain a queue of teammate models and calculate an adaptive rollout length for each agent to support its self-imagination and reduce the model-based return discrepancy. Finally, extensive experiments on the PettingZoo benchmark show that CLWPO significantly improves learning efficiency and agent performance compared with state-of-the-art MARL methods.
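The abstract does not spell out the form of the contrastive variational bound. As a rough illustration of the contrastive idea only, the sketch below implements a generic InfoNCE-style loss in plain Python: each predicted latent vector should score highest against its own target latent and lower against the other targets in the batch. The function name `info_nce_loss`, the `temperature` parameter, and the toy vectors are assumptions made for this example and are not taken from the paper; this is a stand-in technique, not the authors' CVB.

```python
import math


def info_nce_loss(predicted, targets, temperature=0.1):
    """Generic InfoNCE loss (illustrative; not the paper's CVB).

    Row i of `predicted` is treated as the positive match for row i of
    `targets`; every other row of `targets` acts as a negative.
    """
    def normalize(vec):
        # Scale to unit length so dot products become cosine similarities.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    pred = [normalize(v) for v in predicted]
    targ = [normalize(v) for v in targets]

    total = 0.0
    for i, p in enumerate(pred):
        # Similarity of prediction i to every target, scaled by temperature.
        logits = [sum(a * b for a, b in zip(p, t)) / temperature for t in targ]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching index i as the correct class.
        total += log_sum_exp - logits[i]
    return total / len(pred)


# Toy latents (hypothetical): matched pairs versus deliberately mismatched pairs.
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = info_nce_loss(aligned, aligned)
loss_shuffled = info_nce_loss(aligned, shuffled)
```

In this toy setup the loss is close to zero when predictions line up with their own targets and much larger when the pairing is shuffled, which is the behavior a contrastive objective is meant to reward.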

Cite

Text

Yang et al. "Acting Beyond Learning: Imagination-Assisted Decision-Making in the Visual-Based Multi-Agent Cooperative Scenarios." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I20.35502

Markdown

[Yang et al. "Acting Beyond Learning: Imagination-Assisted Decision-Making in the Visual-Based Multi-Agent Cooperative Scenarios." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/yang2025aaai-acting/) doi:10.1609/AAAI.V39I20.35502

BibTeX

@inproceedings{yang2025aaai-acting,
  title     = {{Acting Beyond Learning: Imagination-Assisted Decision-Making in the Visual-Based Multi-Agent Cooperative Scenarios}},
  author    = {Yang, Huanhuan and Shi, Dianxi and Jin, Songchang and Xie, Guojun and Chen, Yang and Qiu, Chunping and Yang, Shaowu},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {21947--21955},
  doi       = {10.1609/AAAI.V39I20.35502},
  url       = {https://mlanthology.org/aaai/2025/yang2025aaai-acting/}
}