Unified Multi-Agent Trajectory Modeling with Masked Trajectory Diffusion

Abstract

Understanding movements in multi-agent scenarios is a fundamental problem in intelligent systems. Previous research assumes complete and synchronized observations. In real-world settings, however, partial observation caused by occlusions leads to inevitable model failure, which calls for a unified framework that handles trajectory prediction, imputation, and recovery together. Unlike previous attempts that handle observed and unobserved behaviors in a coupled manner, we explore a decoupled denoising diffusion modeling paradigm with a unidirectional information valve to isolate the interference from uncertain behaviors. Building on this, we propose a Unified Masked Trajectory Diffusion model (UniMTD) for arbitrary levels of missing observations. We design a unidirectional attention as a valve unit that controls the direction of information flow between the observed and masked areas, gradually refining the missing observations toward a real-world distribution. We build it into a unidirectional mixture-of-experts (MoE) structure to handle varying proportions of missing observations. A Cached Diffusion model is further designed to improve generation quality while reducing computation and time overhead. Our method achieves substantial gains across human motion and vehicle traffic scenarios. UniMTD efficiently achieves a 65% improvement in minADE20 and reaches state-of-the-art results, with advantages of 98%, 50%, 73%, and 29% across 4 fidelity metrics on out-of-boundary, velocity, and trajectory length. Our code will be released.
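The unidirectional attention described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes one plausible reading of the "valve", in which masked (unobserved) tokens may read from every token while observed tokens read only from other observed tokens, so uncertain estimates never flow back into the observations. The function names `unidirectional_mask` and `masked_attention` are hypothetical, and the single-head NumPy attention is only a stand-in for whatever attention layer UniMTD actually uses.

```python
import numpy as np


def unidirectional_mask(observed):
    """Return allowed[q, k]: True if query token q may attend to key token k.

    observed: boolean array of shape (N,), True where a token is observed.
    Assumed rule (not confirmed by the abstract): a key is visible if it is
    observed, or if the query itself is a masked token.
    """
    observed = np.asarray(observed, dtype=bool)
    return observed[None, :] | ~observed[:, None]


def masked_attention(q, k, v, allowed):
    """Single-head scaled dot-product attention restricted by `allowed`."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Blocked pairs get -inf so their softmax weight is exactly zero.
    scores = np.where(allowed, scores, -np.inf)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    observed = np.array([True, True, False, False])  # last two tokens missing
    allowed = unidirectional_mask(observed)
    q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
    out, weights = masked_attention(q, k, v, allowed)
    print(allowed.astype(int))
    print(weights.round(3))
```

Under this assumed rule, the rows of `weights` belonging to observed tokens place zero weight on masked tokens, so changing the current estimate of a missing position leaves the observed-token outputs unchanged, while masked tokens still draw on the full context.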

Cite

Text

Yang et al. "Unified Multi-Agent Trajectory Modeling with Masked Trajectory Diffusion." International Conference on Computer Vision, 2025.

Markdown

[Yang et al. "Unified Multi-Agent Trajectory Modeling with Masked Trajectory Diffusion." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/yang2025iccv-unified/)

BibTeX

@inproceedings{yang2025iccv-unified,
  title     = {{Unified Multi-Agent Trajectory Modeling with Masked Trajectory Diffusion}},
  author    = {Yang, Songru and Shi, Zhenwei and Zou, Zhengxia},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {27563--27574},
  url       = {https://mlanthology.org/iccv/2025/yang2025iccv-unified/}
}