PointBeV: A Sparse Approach for BeV Predictions

Abstract

Bird's-eye View (BeV) representations have emerged as the de facto shared space in driving applications, offering a unified space for sensor data fusion and supporting various downstream tasks. However, conventional models use grids with fixed resolution and range and face computational inefficiencies due to the uniform allocation of resources across all cells. To address this, we propose PointBeV, a novel sparse BeV segmentation model operating on sparse BeV cells instead of dense grids. This approach offers precise control over memory usage, enabling the use of long temporal contexts and accommodating memory-constrained platforms. PointBeV employs an efficient two-pass strategy for training, enabling focused computation on regions of interest. At inference time, it can be used with various memory/performance trade-offs and flexibly adjusts to new specific use cases. PointBeV achieves state-of-the-art results on the nuScenes dataset for vehicle, pedestrian, and lane segmentation, showcasing superior performance in static and temporal settings despite being trained solely with sparse signals. We release our code with two new efficient modules used in the architecture: Sparse Feature Pulling, designed for the effective extraction of features from images to BeV, and Submanifold Attention, which enables efficient temporal modeling. The code is available at https://github.com/valeoai/PointBeV.
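The abstract does not say how the two-pass strategy selects cells, so the following is only an illustrative sketch of one way a coarse-then-fine selection over sparse BeV cells could work. It is not taken from the PointBeV code base: the function names, the stride, threshold, and radius parameters, and the toy scoring function are all hypothetical and stand in for a real model.

```python
from typing import Callable, Set, Tuple

Cell = Tuple[int, int]  # (row, column) index of a BeV cell


def coarse_cells(height: int, width: int, stride: int) -> Set[Cell]:
    """Regularly subsample the BeV grid: one cell every `stride` rows/columns."""
    return {(r, c) for r in range(0, height, stride) for c in range(0, width, stride)}


def two_pass_select(
    height: int,
    width: int,
    score_fn: Callable[[Cell], float],
    stride: int = 4,
    threshold: float = 0.5,
    radius: int = 1,
) -> Set[Cell]:
    """Return the sparse set of cells evaluated over both passes (assumed scheme)."""
    # Pass 1: score a coarse, regularly spaced subset of cells.
    coarse = coarse_cells(height, width, stride)
    anchors = {cell for cell in coarse if score_fn(cell) >= threshold}

    # Pass 2: densify only around high-scoring anchors, clipped to the grid.
    fine: Set[Cell] = set()
    for r, c in anchors:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < height and 0 <= cc < width:
                    fine.add((rr, cc))
    return coarse | fine


def toy_score(cell: Cell) -> float:
    """Hypothetical stand-in for a model: high score near cell (8, 8)."""
    r, c = cell
    return 1.0 if abs(r - 8) <= 2 and abs(c - 8) <= 2 else 0.0


selected = two_pass_select(16, 16, toy_score)
print(f"evaluated {len(selected)} of {16 * 16} cells")
```

In this toy setting only 24 of the 256 cells are evaluated, which illustrates the kind of memory control the abstract describes; the actual selection rule and budgets used by PointBeV may differ.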

Cite

Text

Chambon et al. "PointBeV: A Sparse Approach for BeV Predictions." Conference on Computer Vision and Pattern Recognition, 2024.

Markdown

[Chambon et al. "PointBeV: A Sparse Approach for BeV Predictions." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/chambon2024cvpr-pointbev/)

BibTeX

@inproceedings{chambon2024cvpr-pointbev,
  title     = {{PointBeV: A Sparse Approach for BeV Predictions}},
  author    = {Chambon, Loick and Zablocki, Eloi and Chen, Mickaël and Bartoccioni, Florent and Pérez, Patrick and Cord, Matthieu},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {15195--15204},
  url       = {https://mlanthology.org/cvpr/2024/chambon2024cvpr-pointbev/}
}