UnO: Unsupervised Occupancy Fields for Perception and Forecasting

Abstract

Perceiving the world and forecasting its future state is a critical task for self-driving. Supervised approaches leverage annotated object labels to learn a model of the world, traditionally with object detections and trajectory predictions or temporal bird's-eye-view (BEV) occupancy fields. However, these annotations are expensive and typically limited to a set of predefined categories that do not cover everything we might encounter on the road. Instead, we learn to perceive and forecast a continuous 4D (spatio-temporal) occupancy field with self-supervision from LiDAR data. This unsupervised world model can be easily and effectively transferred to downstream tasks. We tackle point cloud forecasting by adding a lightweight learned renderer and achieve state-of-the-art performance on Argoverse 2, nuScenes, and KITTI. To further showcase its transferability, we fine-tune our model for BEV semantic occupancy forecasting and show that it outperforms the fully supervised state-of-the-art, especially when labeled data is scarce. Finally, when compared to prior state-of-the-art on spatio-temporal geometric occupancy prediction, our 4D world model achieves a much higher recall of objects from classes relevant to self-driving.
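
To make "self-supervision from LiDAR data" concrete, the sketch below shows one plausible way a single LiDAR ray could yield occupancy pseudo-labels without annotations: points between the sensor and the return are treated as free, and a point just behind the return is treated as occupied. The abstract does not specify the sampling scheme, so the function name `ray_pseudo_labels`, the parameters `n_free` and `occupied_depth`, and the uniform sampling are illustrative assumptions rather than the authors' implementation.

```python
import math
import random


def ray_pseudo_labels(origin, hit, t, n_free=4, occupied_depth=0.1, rng=None):
    """Build (x, y, z, t, label) query points for one LiDAR ray.

    Assumption (not from the paper's abstract): space between the sensor
    origin and the return is free (label 0.0), and a thin region just
    behind the return is occupied (label 1.0).
    """
    rng = rng or random.Random(0)

    # Unit direction from the sensor to the LiDAR return.
    direction = [h - o for h, o in zip(hit, origin)]
    ray_len = math.sqrt(sum(d * d for d in direction))
    if ray_len == 0.0:
        raise ValueError("LiDAR return coincides with the sensor origin")
    unit = [d / ray_len for d in direction]

    def point_at(distance):
        return [o + distance * u for o, u in zip(origin, unit)]

    samples = []
    # Free-space samples: anywhere along the ray before the return.
    for _ in range(n_free):
        s = rng.uniform(0.0, ray_len)
        samples.append((*point_at(s), t, 0.0))

    # Occupied sample: slightly behind the return.
    s = ray_len + rng.uniform(0.0, occupied_depth)
    samples.append((*point_at(s), t, 1.0))
    return samples


if __name__ == "__main__":
    for sample in ray_pseudo_labels((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), t=0.5):
        print(sample)
```

Under these assumptions, a model queried at each `(x, y, z, t)` could be trained with a binary classification loss against the labels, which is what makes the occupancy field continuous in both space and time.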

Cite

Text

Agro et al. "UnO: Unsupervised Occupancy Fields for Perception and Forecasting." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.01373

Markdown

[Agro et al. "UnO: Unsupervised Occupancy Fields for Perception and Forecasting." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/agro2024cvpr-uno/) doi:10.1109/CVPR52733.2024.01373

BibTeX

@inproceedings{agro2024cvpr-uno,
  title     = {{UnO: Unsupervised Occupancy Fields for Perception and Forecasting}},
  author    = {Agro, Ben and Sykora, Quinlan and Casas, Sergio and Gilles, Thomas and Urtasun, Raquel},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {14487--14496},
  doi       = {10.1109/CVPR52733.2024.01373},
  url       = {https://mlanthology.org/cvpr/2024/agro2024cvpr-uno/}
}