CaMuViD: Calibration-Free Multi-View Detection
Abstract
Multi-view object detection in crowded environments presents significant challenges, particularly for occlusion management across multiple camera views. This paper introduces a novel approach that extends conventional multi-view detection to operate directly within each camera's image space. Our method finds object bounding boxes in images from multiple perspectives without resorting to a bird's-eye view (BEV) representation. Our approach thus removes the need for camera calibration by leveraging a learnable architecture that facilitates flexible transformations and improves feature fusion across perspectives to increase detection accuracy. Our model achieves Multi-Object Detection Accuracy (MODA) scores of 95.0% and 96.5% on the Wildtrack and MultiviewX datasets, respectively, significantly advancing the state of the art in multi-view detection. Furthermore, it demonstrates robust performance even without ground truth annotations, highlighting its resilience and practicality in real-world applications. These results emphasize the effectiveness of our calibration-free multi-view object detector.
Cite
Text
Daryani et al. "CaMuViD: Calibration-Free Multi-View Detection." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.00122
Markdown
[Daryani et al. "CaMuViD: Calibration-Free Multi-View Detection." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/daryani2025cvpr-camuvid/) doi:10.1109/CVPR52734.2025.00122
BibTeX
@inproceedings{daryani2025cvpr-camuvid,
title = {{CaMuViD: Calibration-Free Multi-View Detection}},
author = {Daryani, Amir Etefaghi and Bhutta, M. Usman Maqbool and Hernandez, Byron and Medeiros, Henry},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2025},
pages = {1220-1229},
doi = {10.1109/CVPR52734.2025.00122},
url = {https://mlanthology.org/cvpr/2025/daryani2025cvpr-camuvid/}
}