OccFeat: Self-Supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks
Abstract
We introduce a self-supervised pretraining method, called OccFeat, for camera-only Bird's-Eye-View (BEV) segmentation networks. With OccFeat, we pretrain a BEV network via occupancy prediction and feature distillation tasks. Occupancy prediction provides a 3D geometric understanding of the scene to the model. However, the geometry learned is class-agnostic. Hence, we add semantic information to the model in the 3D space through distillation from a self-supervised pretrained image foundation model. Models pretrained with our method exhibit improved BEV semantic segmentation performance, particularly in low-data scenarios. Moreover, empirical results affirm the efficacy of integrating feature distillation with 3D occupancy prediction in our pretraining approach.
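As a rough illustration of how the two pretraining tasks named in the abstract could be combined into a single objective, the sketch below sums an occupancy term and a feature distillation term. The abstract does not specify the losses, so everything here is an assumption: the binary cross-entropy for occupancy, the cosine distance for distillation, the restriction of distillation to occupied voxels, the `feat_weight` parameter, and all function names are hypothetical and not taken from the paper.

```python
import math


def binary_cross_entropy(pred_probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted occupancy probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred_probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(targets)


def cosine_distance(a, b, eps=1e-12):
    """1 - cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / max(norm_a * norm_b, eps)


def pretraining_loss(occ_probs, occ_labels, pred_feats, teacher_feats, feat_weight=1.0):
    """Occupancy loss plus a distillation loss over occupied voxels (assumed design).

    occ_probs:     predicted occupancy probability per voxel
    occ_labels:    0/1 occupancy target per voxel
    pred_feats:    feature vector predicted by the BEV network per voxel
    teacher_feats: target feature per voxel from the image foundation model
    """
    occ_loss = binary_cross_entropy(occ_probs, occ_labels)
    # Assumption: features are only distilled where the target voxel is occupied.
    occupied = [i for i, t in enumerate(occ_labels) if t == 1]
    if occupied:
        feat_loss = sum(
            cosine_distance(pred_feats[i], teacher_feats[i]) for i in occupied
        ) / len(occupied)
    else:
        feat_loss = 0.0
    return occ_loss + feat_weight * feat_loss


# Toy example with two voxels; only the first is occupied.
loss = pretraining_loss(
    occ_probs=[0.9, 0.1],
    occ_labels=[1, 0],
    pred_feats=[[1.0, 0.0], [0.0, 1.0]],
    teacher_feats=[[1.0, 0.0], [1.0, 0.0]],
)
```

In this toy example the predicted and teacher features agree on the only occupied voxel, so the distillation term vanishes and the total reduces to the occupancy term. The mismatch on the empty voxel is ignored by construction.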
Cite
Text
Sirko-Galouchenko et al. "OccFeat: Self-Supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024. doi:10.1109/CVPRW63382.2024.00452
Markdown
[Sirko-Galouchenko et al. "OccFeat: Self-Supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2024.](https://mlanthology.org/cvprw/2024/sirkogalouchenko2024cvprw-occfeat/) doi:10.1109/CVPRW63382.2024.00452
BibTeX
@inproceedings{sirkogalouchenko2024cvprw-occfeat,
title = {{OccFeat: Self-Supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks}},
author = {Sirko-Galouchenko, Sophia and Boulch, Alexandre and Gidaris, Spyros and Bursuc, Andrei and Vobecký, Antonín and Pérez, Patrick and Marlet, Renaud},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2024},
pages = {4493-4503},
doi = {10.1109/CVPRW63382.2024.00452},
url = {https://mlanthology.org/cvprw/2024/sirkogalouchenko2024cvprw-occfeat/}
}