"The Pedestrian Next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping
Abstract
Estimating a semantically segmented bird's-eye-view (BEV) map from a single image has become a popular technique for autonomous control and navigation. However, such methods show an increase in localization error with distance from the camera. While some increase in error is entirely expected - localization is harder at distance - much of the drop in performance can be attributed to the cues used by current texture-based models: in particular, they rely heavily on object-ground intersections (such as shadows), which become increasingly sparse and uncertain for distant objects. In this work, we address these shortcomings in BEV mapping by learning the spatial relationships between objects in a scene. We propose a graph neural network which predicts BEV objects from a monocular image by spatially reasoning about each object within the context of other objects. Our approach sets a new state of the art in BEV estimation from monocular images across three large-scale datasets, including a 50% relative improvement for objects on nuScenes.
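The core idea of the abstract - predicting each object's BEV position by aggregating context from related objects in a graph - can be illustrated with a minimal message-passing step. This is only a hedged sketch of generic graph message passing, not the authors' actual architecture; the feature values, adjacency, and the `message_passing` helper are all hypothetical.

```python
import numpy as np

def message_passing(node_feats, adj):
    """One round of mean-aggregation message passing over an object graph.

    node_feats: (N, D) array, one feature row per detected object.
    adj: (N, N) binary adjacency (1 = the two objects are related).
    Hypothetical illustration only, not the paper's network.
    """
    deg = adj.sum(axis=1, keepdims=True).astype(float)
    deg[deg == 0] = 1.0  # avoid division by zero for isolated nodes
    neighbor_mean = (adj @ node_feats) / deg
    # blend each object's own features with its neighborhood context,
    # so e.g. a pedestrian's estimate can borrow cues from a nearby lamppost
    return 0.5 * (node_feats + neighbor_mean)

# three toy objects: a pedestrian, a lamppost, and an isolated distant car
feats = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
adj = np.array([[0, 1, 0],
                [1, 0, 0],
                [0, 0, 0]])
out = message_passing(feats, adj)
```

Here the pedestrian and lamppost exchange features, while the isolated car keeps a scaled copy of its own; a learned network would replace the fixed mean with trainable, adaptive edge weights.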
Cite
Text

Saha et al. ""The Pedestrian Next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping." Conference on Computer Vision and Pattern Recognition, 2022.

Markdown

[Saha et al. ""The Pedestrian Next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/saha2022cvpr-pedestrian/)

BibTeX
@inproceedings{saha2022cvpr-pedestrian,
title = {{"The Pedestrian Next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping}},
author = {Saha, Avishkar and Mendez, Oscar and Russell, Chris and Bowden, Richard},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2022},
pages = {19528-19537},
url = {https://mlanthology.org/cvpr/2022/saha2022cvpr-pedestrian/}
}