From a Bird's Eye View to See: Joint Camera and Subject Registration Without the Camera Calibration

Abstract

We tackle a new problem: multi-view camera and subject registration in the bird's eye view (BEV) without pre-given camera calibration, which advances multi-view subject registration to a calibration-free setting. This greatly alleviates a limitation in many practical applications. However, the problem is very challenging: the only input is several RGB images from different first-person views (FPVs), with neither a BEV image nor FPV calibration, while the output is a unified plane aggregated from all views that contains the positions and orientations of both the subjects and the cameras in the BEV. To this end, we propose an end-to-end framework that solves camera and subject registration jointly by exploiting their mutual dependence. Its main steps are: i) a subject view-transform module (VTM) projects each pedestrian from the FPV to a virtual BEV; ii) a multi-view geometry-based spatial alignment module (SAM) estimates the relative camera pose in a unified BEV; iii) the subject and camera registration results are selected and refined within the unified BEV. We collect a new large-scale synthetic dataset with rich annotations for training and evaluation, and a real dataset for cross-domain evaluation. Experimental results demonstrate the effectiveness of our method. The code and proposed datasets are available at https://github.com/zekunqian/BEVSee.
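
To make the spatial alignment idea concrete, here is a minimal sketch of how two per-camera BEV maps could be brought into one frame. It is not the paper's SAM: it assumes the same subjects have already been matched across two views, and it uses a standard least-squares rigid fit (the 2D Kabsch/Procrustes solution) as a stand-in. The function name `estimate_bev_rigid_transform` and the sample coordinates are hypothetical.

```python
import numpy as np


def estimate_bev_rigid_transform(src, dst):
    """Least-squares 2D rotation R and translation t such that dst ~= src @ R.T + t.

    src, dst: (N, 2) arrays of matched subject positions in two BEV frames.
    Assumption: correspondences are known; the paper does not state this.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 2x2 cross-covariance of the centered point sets
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation, not a reflection
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


# Hypothetical subject positions in camera A's virtual BEV (meters)
subjects_a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])

# Simulate camera B's BEV frame: rotated by 30 degrees and shifted
theta = np.deg2rad(30.0)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
t_true = np.array([2.0, -1.0])
subjects_b = subjects_a @ R_true.T + t_true

R_est, t_est = estimate_bev_rigid_transform(subjects_a, subjects_b)
# Relative heading between the two BEV frames, in degrees
relative_angle = np.rad2deg(np.arctan2(R_est[1, 0], R_est[0, 0]))
```

In this sketch the recovered rotation and translation play the role of a relative camera pose in the shared BEV plane. With noisy or partially wrong matches, a robust variant would be needed.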

Cite

Text

Qian et al. "From a Bird's Eye View to See: Joint Camera and Subject Registration Without the Camera Calibration." Conference on Computer Vision and Pattern Recognition, 2024.

Markdown

[Qian et al. "From a Bird's Eye View to See: Joint Camera and Subject Registration Without the Camera Calibration." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/qian2024cvpr-bird/)

BibTeX

@inproceedings{qian2024cvpr-bird,
  title     = {{From a Bird's Eye View to See: Joint Camera and Subject Registration Without the Camera Calibration}},
  author    = {Qian, Zekun and Han, Ruize and Feng, Wei and Wang, Song},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {863-873},
  url       = {https://mlanthology.org/cvpr/2024/qian2024cvpr-bird/}
}