Wavelet-Driven Spatiotemporal Predictive Learning: Bridging Frequency and Time Variations

Abstract

Spatiotemporal predictive learning is a paradigm in which models learn spatial and temporal patterns by predicting future frames from past frames in an unsupervised manner. Existing methods typically rely on recurrent units to capture long-term dependencies, but these units often incur high computational costs and deliver limited performance in real-world scenes. This paper presents a Wavelet-based SpatioTemporal (WaST) framework, which extracts and adaptively controls both low- and high-frequency components at the image and feature levels via a 3D discrete wavelet transform, enabling faster processing while maintaining high-quality predictions. We propose a Time-Frequency Aware Translator designed to efficiently learn short- and long-range spatiotemporal information by separately modeling spatial frequency and temporal variations. We also design a wavelet-domain High-Frequency Focal Loss that effectively supervises high-frequency variations. Extensive experiments across various real-world scenarios, including driving scene prediction, traffic flow prediction, human motion capture, and weather forecasting, demonstrate that WaST achieves state-of-the-art performance compared with a range of spatiotemporal prediction methods.
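
To make the 3D discrete wavelet transform mentioned in the abstract concrete, the following is a minimal sketch of a one-level decomposition of a frame sequence into eight low/high-frequency subbands. It is not the WaST implementation: the choice of the Haar wavelet, the `(T, H, W)` array layout, and the function names `haar_split` and `haar_dwt3d` are all assumptions made for illustration.

```python
import numpy as np


def haar_split(x, axis):
    """One-level orthonormal Haar split along one axis (its length must be even).

    Hypothetical helper for illustration; returns (low, high) half-length arrays.
    """
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    low = (even + odd) / np.sqrt(2.0)   # local average: low-frequency content
    high = (even - odd) / np.sqrt(2.0)  # local difference: high-frequency content
    return low, high


def haar_dwt3d(video):
    """Separable one-level 3D Haar DWT of a (T, H, W) array with even sizes.

    Returns a dict of 8 subbands keyed by three letters, one per axis in
    (time, height, width) order, e.g. 'LLL' or 'LHH'.
    """
    bands = {"": np.asarray(video, dtype=np.float64)}
    for axis in range(3):
        split = {}
        for key, arr in bands.items():
            low, high = haar_split(arr, axis)
            split[key + "L"] = low
            split[key + "H"] = high
        bands = split
    return bands


# Usage: decompose a random 4-frame, 8x8 clip.
rng = np.random.default_rng(0)
clip = rng.standard_normal((4, 8, 8))
subbands = haar_dwt3d(clip)
```

Each subband has half the resolution along every axis, which is one reason wavelet-domain processing can be cheaper. Because this Haar variant is orthonormal, the total energy of the eight subbands equals that of the input clip.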

Cite

Text

Nie et al. "Wavelet-Driven Spatiotemporal Predictive Learning: Bridging Frequency and Time Variations." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I5.28230

Markdown

[Nie et al. "Wavelet-Driven Spatiotemporal Predictive Learning: Bridging Frequency and Time Variations." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/nie2024aaai-wavelet/) doi:10.1609/AAAI.V38I5.28230

BibTeX

@inproceedings{nie2024aaai-wavelet,
  title     = {{Wavelet-Driven Spatiotemporal Predictive Learning: Bridging Frequency and Time Variations}},
  author    = {Nie, Xuesong and Yan, Yunfeng and Li, Siyuan and Tan, Cheng and Chen, Xi and Jin, Haoyuan and Zhu, Zhihang and Li, Stan Z. and Qi, Donglian},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {4334-4342},
  doi       = {10.1609/AAAI.V38I5.28230},
  url       = {https://mlanthology.org/aaai/2024/nie2024aaai-wavelet/}
}