Multimodal Transformer Networks for Pedestrian Trajectory Prediction

Yin, Ziyi; Liu, Ruijin; Xiong, Zhiliang; Yuan, Zejian

doi:10.24963/IJCAI.2021/174

IJCAI 2021 pp. 1259-1265

Abstract

We consider the problem of forecasting the future locations of pedestrians in the ego-centric view of a moving vehicle. Current CNN- and RNN-based methods struggle to capture the highly dynamic motion between pedestrians and the ego-vehicle, and they require a large number of parameters because they learn long-term temporal dependencies inefficiently. To address these issues, we propose an efficient multimodal transformer network that aggregates trajectory and ego-vehicle speed variations at a coarse granularity and interacts with the optical flow at a fine-grained level to compensate for highly dynamic motion. Specifically, a coarse-grained fusion stage fuses information from the trajectory and ego-vehicle speed modalities to capture the general temporal consistency. Meanwhile, a fine-grained fusion stage merges the optical flow in the center area and the pedestrian area, which compensates for the highly dynamic motion of the ego-vehicle and the target pedestrian. In addition, the whole network is purely attention-based, so it can efficiently model long-term sequences and better capture temporal variations. Our multimodal transformer is validated on the PIE and JAAD datasets and achieves state-of-the-art performance with the most lightweight model size. The code is available at https://github.com/ericyinyzy/MTN_trajectory.
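The two-stage fusion described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes that each fusion stage can be approximated by plain scaled dot-product cross-attention with a residual connection, and it omits learned projections, multiple heads, and positional encodings. The function names (`cross_attention`, `fuse`) and all input values are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def fuse(primary, secondary):
    """Hypothetical fusion step: primary tokens query the secondary modality,
    and the attended result is added back as a residual."""
    attended = cross_attention(primary, secondary, secondary)
    return [[p + a for p, a in zip(p_row, a_row)]
            for p_row, a_row in zip(primary, attended)]


# Made-up toy inputs: 3 observed time steps, 2-d features per modality.
trajectory = [[0.1, 0.2], [0.2, 0.3], [0.3, 0.5]]
ego_speed = [[1.0, 0.0], [1.1, 0.0], [1.3, 0.1]]
optical_flow = [[0.0, 0.4], [0.1, 0.5], [0.2, 0.7], [0.3, 0.6]]  # 4 flow tokens

coarse = fuse(trajectory, ego_speed)  # stand-in for the coarse-grained stage
fine = fuse(coarse, optical_flow)     # stand-in for the fine-grained stage
```

Because attention handles sequences of different lengths, the flow modality can contribute more tokens than there are trajectory steps while the fused output keeps one row per observed time step.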

Cite

Text

Yin et al. "Multimodal Transformer Networks for Pedestrian Trajectory Prediction." International Joint Conference on Artificial Intelligence, 2021. doi:10.24963/IJCAI.2021/174

Markdown

[Yin et al. "Multimodal Transformer Networks for Pedestrian Trajectory Prediction." International Joint Conference on Artificial Intelligence, 2021.](https://mlanthology.org/ijcai/2021/yin2021ijcai-multimodal/) doi:10.24963/IJCAI.2021/174

BibTeX

@inproceedings{yin2021ijcai-multimodal,
  title     = {{Multimodal Transformer Networks for Pedestrian Trajectory Prediction}},
  author    = {Yin, Ziyi and Liu, Ruijin and Xiong, Zhiliang and Yuan, Zejian},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {1259-1265},
  doi       = {10.24963/IJCAI.2021/174},
  url       = {https://mlanthology.org/ijcai/2021/yin2021ijcai-multimodal/}
}