GeoGraph: Graph-Based Multi-View Object Detection with Geometric Cues End-to-End

Abstract

In this paper, we propose an end-to-end learnable approach that detects static urban objects from multiple views, re-identifies instances, and finally assigns a geographic position to each object. Our method relies on a Graph Neural Network (GNN) to detect all objects and output their geographic positions given images and approximate camera poses as input. Our GNN simultaneously models relative pose and image evidence and can handle an arbitrary number of input views. By jointly reasoning about visual image appearance and relative pose, our method is robust to occlusion, similar appearance of neighboring objects, and severe changes in viewpoint. Experimental evaluation on two challenging, large-scale datasets and comparison with state-of-the-art methods show significant and systematic improvements in both accuracy and efficiency, with a 2-6% gain in detection and re-ID average precision as well as an 8x reduction in training time.

Cite

Text

Nassar et al. "GeoGraph: Graph-Based Multi-View Object Detection with Geometric Cues End-to-End." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58571-6_29

Markdown

[Nassar et al. "GeoGraph: Graph-Based Multi-View Object Detection with Geometric Cues End-to-End." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/nassar2020eccv-geograph/) doi:10.1007/978-3-030-58571-6_29

BibTeX

@inproceedings{nassar2020eccv-geograph,
  title     = {{GeoGraph: Graph-Based Multi-View Object Detection with Geometric Cues End-to-End}},
  author    = {Nassar, Ahmed Samy and D’Aronco, Stefano and Lefèvre, Sébastien and Wegner, Jan D.},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58571-6_29},
  url       = {https://mlanthology.org/eccv/2020/nassar2020eccv-geograph/}
}