RayFusion: Ray Fusion Enhanced Collaborative Visual Perception

Abstract

Collaborative visual perception methods have gained widespread attention in the autonomous driving community in recent years due to their ability to address sensor limitation problems. However, the absence of explicit depth information often makes it difficult for camera-based perception systems, e.g., 3D object detection, to generate accurate predictions. To alleviate the ambiguity in depth estimation, we propose RayFusion, a ray-based fusion method for collaborative visual perception. Using ray occupancy information from collaborators, RayFusion reduces redundancy and false positive predictions along camera rays, enhancing the detection performance of purely camera-based collaborative perception systems. Comprehensive experiments show that our method consistently outperforms existing state-of-the-art models, substantially advancing the performance of collaborative visual perception. Our code will be made publicly available.
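The depth ambiguity mentioned in the abstract can be illustrated with a minimal pinhole-camera sketch. This is not the RayFusion method: the intrinsics (`focal`, `cx`), the 2D ground-plane setup, and the second camera's position are hypothetical values chosen for illustration. It only shows why every point along one camera ray is indistinguishable to that camera, and why a collaborator at a different viewpoint can tell those points apart.

```python
# Illustrative only: a 1D pinhole projection in the ground plane (x, z).
# All numbers below are assumptions, not values from the paper.

def project(point, cam_x=0.0, focal=1000.0, cx=640.0):
    """Project a ground-plane point (x, z) to a horizontal pixel coordinate
    for a camera located at (cam_x, 0) and looking along +z."""
    x, z = point
    return focal * (x - cam_x) / z + cx

# Three candidate object positions at different depths along ONE ray of camera A.
direction = (0.2, 1.0)
candidates = [(direction[0] * d, direction[1] * d) for d in (10.0, 20.0, 40.0)]

# Camera A (at the origin) maps every candidate to the same pixel,
# so it cannot tell which depth is the occupied one.
pixels_a = [project(p) for p in candidates]

# A hypothetical collaborator, camera B, sits 5 m to the side and
# sees the three candidates at clearly different pixels.
pixels_b = [project(p, cam_x=5.0) for p in candidates]

print("camera A:", pixels_a)
print("camera B:", pixels_b)
```

In this toy setup, an observation from camera B is enough to rule out the candidate depths that are not actually occupied, which is the intuition behind using information from collaborators to suppress false positives along a ray.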

Cite

Text

Wang et al. "RayFusion: Ray Fusion Enhanced Collaborative Visual Perception." Advances in Neural Information Processing Systems, 2025.

Markdown

[Wang et al. "RayFusion: Ray Fusion Enhanced Collaborative Visual Perception." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/wang2025neurips-rayfusion/)

BibTeX

@inproceedings{wang2025neurips-rayfusion,
  title     = {{RayFusion: Ray Fusion Enhanced Collaborative Visual Perception}},
  author    = {Wang, Shaohong and Lu, Bin and Xiao, Xinyu and Zhong, Hanzhi and Pang, Bowen and Wang, Tong and Xiang, Zhiyu and Shan, Hangguan and Liu, Eryun},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/wang2025neurips-rayfusion/}
}