Exploiting Continuous Motion Clues for Vision-Based Occupancy Prediction
Abstract
Occupancy networks aim to reconstruct the surroundings with occupied semantic voxels. However, frequent object occlusions occur in dynamic real-world scenarios and cannot be captured by independent frames. Most existing occupancy networks generate results without explicitly considering past occupancy states and continuous visual changes over time, limiting their temporal accuracy. We tackle this by treating the task from a new continuous-updating perspective that considers historical data and continuous motion clues. We propose a new approach termed Continuous Motion clue exploitation for Occupancy Prediction (CMOP), which incorporates three key designs: (i) Propagator: which forecasts future occupancy states based on historical data; (ii) Tracker: which updates the occupancy on a per-frame basis using dynamic visual motion information; and (iii) Fuser: which aggregates the outputs of the Propagator and Tracker into more robust and accurate occupancy results. Experiments on several benchmarks demonstrate that CMOP outperforms state-of-the-art baselines.
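To illustrate how the three modules described above could fit together, the following is a minimal PyTorch sketch. The module names (Propagator, Tracker, Fuser) mirror the abstract, but every layer choice, tensor shape, and the gated fusion rule are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the CMOP three-module design described in the abstract.
# All architectural details below are assumptions for illustration only.

class Propagator(nn.Module):
    """Forecasts the current occupancy state from historical occupancy."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Conv3d(channels, channels, kernel_size=3, padding=1)

    def forward(self, past_occ: torch.Tensor) -> torch.Tensor:
        # past_occ: (B, C, D, H, W) voxel features from previous frames
        return self.net(past_occ)

class Tracker(nn.Module):
    """Updates occupancy per frame using visual motion cues."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Conv3d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, current_feat: torch.Tensor, motion_feat: torch.Tensor) -> torch.Tensor:
        # current_feat, motion_feat: (B, C, D, H, W)
        return self.net(torch.cat([current_feat, motion_feat], dim=1))

class Fuser(nn.Module):
    """Aggregates Propagator and Tracker outputs into a final occupancy estimate."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Conv3d(2 * channels, channels, kernel_size=1)

    def forward(self, prop_occ: torch.Tensor, track_occ: torch.Tensor) -> torch.Tensor:
        # A learned gate blends the historical forecast and the per-frame update voxel-wise.
        g = torch.sigmoid(self.gate(torch.cat([prop_occ, track_occ], dim=1)))
        return g * prop_occ + (1.0 - g) * track_occ

if __name__ == "__main__":
    B, C, D, H, W = 1, 8, 16, 50, 50
    past, cur, motion = (torch.randn(B, C, D, H, W) for _ in range(3))
    fused = Fuser(C)(Propagator(C)(past), Tracker(C)(cur, motion))
    print(fused.shape)  # torch.Size([1, 8, 16, 50, 50])
```

The gated sum in `Fuser` is just one plausible way to aggregate the two streams; the paper may use a different fusion mechanism.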
Cite
Text
Xu et al. "Exploiting Continuous Motion Clues for Vision-Based Occupancy Prediction." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I8.32958Markdown
[Xu et al. "Exploiting Continuous Motion Clues for Vision-Based Occupancy Prediction." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/xu2025aaai-exploiting/) doi:10.1609/AAAI.V39I8.32958BibTeX
@inproceedings{xu2025aaai-exploiting,
title = {{Exploiting Continuous Motion Clues for Vision-Based Occupancy Prediction}},
author = {Xu, Haoran and Peng, Peixi and Zhang, Xinyi and Tan, Guang and Li, Yaokun and Wang, Shuaixian and Li, Luntong},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {8860-8868},
doi = {10.1609/AAAI.V39I8.32958},
url = {https://mlanthology.org/aaai/2025/xu2025aaai-exploiting/}
}