Efficient Multi-Agent Offline Coordination via Diffusion-Based Trajectory Stitching

Abstract

Learning from offline data without interacting with the environment is a promising way to fully leverage the intelligent decision-making capabilities of multi-agent reinforcement learning (MARL). Previous approaches have primarily focused on learning techniques, such as conservative methods tailored to MARL with limited offline data. However, these methods often overlook the temporal relationships across timesteps and the spatial relationships between teammates, resulting in low learning efficiency when the data are imbalanced. To exploit the data structure of MARL more fully and improve learning efficiency, we propose Multi-Agent offline coordination via Diffusion-based Trajectory Stitching (MADiTS), a novel diffusion-based data augmentation pipeline that systematically generates trajectories by stitching high-quality coordination segments together. MADiTS first generates trajectory segments using a trained diffusion model, then applies a bidirectional dynamics constraint to ensure that the trajectories align with environmental dynamics. Additionally, we develop an offline credit assignment technique to identify and optimize the behavior of underperforming agents in the generated segments. This procedure iterates until a satisfactory augmented episode trajectory is generated within the predefined limit; otherwise, the trajectory is discarded. Empirical results on imbalanced datasets from multiple benchmarks demonstrate that MADiTS significantly improves MARL performance.
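
The generate, check, and retry structure described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: `sample_segment` stands in for the trained diffusion model, `forward_model` and `backward_model` stand in for learned dynamics models, and the 1-D additive dynamics, the tolerance `tol`, and the retry budget `max_tries` are all assumptions made so the example runs. The offline credit assignment step is omitted here.

```python
import random


def forward_model(state, action):
    # Hypothetical stand-in for a learned forward dynamics model: s' = s + a.
    return state + action


def backward_model(next_state, action):
    # Hypothetical stand-in for a learned backward dynamics model: s = s' - a.
    return next_state - action


def dynamics_consistent(segment, tol):
    """Accept a segment only if every (s, a, s') transition agrees with
    both the forward and the backward model within tol."""
    for state, action, next_state in segment:
        forward_err = abs(forward_model(state, action) - next_state)
        backward_err = abs(backward_model(next_state, action) - state)
        if forward_err > tol or backward_err > tol:
            return False
    return True


def sample_segment(rng, start_state, length, noise):
    """Hypothetical stand-in for the diffusion sampler: produces toy
    transitions whose next states are perturbed by Gaussian noise."""
    segment, state = [], start_state
    for _ in range(length):
        action = rng.uniform(-1.0, 1.0)
        next_state = state + action + rng.gauss(0.0, noise)
        segment.append((state, action, next_state))
        state = next_state
    return segment


def stitch_episode(rng, start_state, n_segments, length, noise, tol, max_tries):
    """Stitch accepted segments end to end. Returns the episode, or None
    (episode discarded) if the sampling budget max_tries is exhausted."""
    episode, state, tries = [], start_state, 0
    while len(episode) < n_segments * length:
        if tries >= max_tries:
            return None  # budget exhausted: discard the partial episode
        tries += 1
        segment = sample_segment(rng, state, length, noise)
        if dynamics_consistent(segment, tol):
            episode.extend(segment)
            state = segment[-1][2]  # next segment starts where this one ended
    return episode
```

In this toy setting, a call such as `stitch_episode(random.Random(0), 0.0, 3, 4, 0.0, 1e-6, 10)` accepts every noise-free segment, while a large `noise` combined with a tight `tol` exhausts the budget and the episode is discarded.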

Cite

Text

Yuan et al. "Efficient Multi-Agent Offline Coordination via Diffusion-Based Trajectory Stitching." International Conference on Learning Representations, 2025.

Markdown

[Yuan et al. "Efficient Multi-Agent Offline Coordination via Diffusion-Based Trajectory Stitching." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/yuan2025iclr-efficient/)

BibTeX

@inproceedings{yuan2025iclr-efficient,
  title     = {{Efficient Multi-Agent Offline Coordination via Diffusion-Based Trajectory Stitching}},
  author    = {Yuan, Lei and Bian, Yuqi and Li, Lihe and Zhang, Ziqian and Guan, Cong and Yu, Yang},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/yuan2025iclr-efficient/}
}