RepVF: A Unified Vector Fields Representation for Multi-Task 3D Perception

Abstract

Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches. This paper addresses these issues by proposing a novel unified representation, RepVF, which harmonizes the representation of various perception tasks such as 3D object detection and 3D lane detection within a single framework. RepVF characterizes the structure of different targets in the scene through a vector field, enabling a single-head, multi-task learning model that significantly reduces computational redundancy and feature competition. Building upon RepVF, we introduce RFTR, a network designed to exploit the inherent connections between different tasks by utilizing a hierarchical structure of queries that implicitly model the relationships both between and within tasks. This approach eliminates the need for task-specific heads and parameters, fundamentally reducing the conflicts inherent in traditional multi-task learning paradigms. We validate our approach by combining labels from the OpenLane dataset with the Waymo Open dataset. Our work presents a significant advancement in the efficiency and effectiveness of multi-task perception in autonomous driving, offering a new perspective on handling multiple 3D perception tasks synchronously and in parallel. The code will be available at: https://github.com/jbji/RepVF.
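As a rough illustration of what a unified representation could look like, the sketch below encodes a 3D box and a 3D lane as same-sized lists of vectors, each a position plus a direction. This is only a toy under stated assumptions, not the paper's formulation: the `(x, y, z, dx, dy, dz)` layout, the choice of eight vectors per target, the use of box corners, and the helper names `box_to_vectors` and `lane_to_vectors` are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical layout: position (x, y, z) followed by a unit direction (dx, dy, dz).
Vector = Tuple[float, float, float, float, float, float]
Point = Tuple[float, float, float]


def box_to_vectors(center: Point, size: Point, yaw: float) -> List[Vector]:
    """Encode a 3D box as its 8 corners, each tagged with the box heading."""
    cx, cy, cz = center
    length, width, height = size
    c, s = math.cos(yaw), math.sin(yaw)
    vectors: List[Vector] = []
    for sx in (-0.5, 0.5):
        for sy in (-0.5, 0.5):
            for sz in (-0.5, 0.5):
                lx, ly = sx * length, sy * width
                # Rotate the local corner offset by yaw about the z axis.
                x = cx + c * lx - s * ly
                y = cy + s * lx + c * ly
                z = cz + sz * height
                vectors.append((x, y, z, c, s, 0.0))
    return vectors


def lane_to_vectors(points: Sequence[Point], k: int = 8) -> List[Vector]:
    """Resample a polyline to k points by arc length, each tagged with its tangent.

    Assumes at least two points, k >= 2, and no zero-length segments.
    """
    cum = [0.0]  # cumulative arc length at each input point
    for a, b in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]
    vectors: List[Vector] = []
    seg = 0
    for i in range(k):
        target = total * i / (k - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        a, b = points[seg], points[seg + 1]
        seg_len = cum[seg + 1] - cum[seg]
        t = (target - cum[seg]) / seg_len
        pos = tuple(a[j] + t * (b[j] - a[j]) for j in range(3))
        direction = tuple((b[j] - a[j]) / seg_len for j in range(3))
        vectors.append(pos + direction)
    return vectors


box = box_to_vectors(center=(0.0, 0.0, 0.0), size=(4.0, 2.0, 1.5), yaw=0.0)
lane = lane_to_vectors([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 10.0, 0.0)], k=8)
```

Because both targets end up with the same shape (eight vectors of six numbers each), a single regression head could in principle predict either kind, which is the intuition behind dropping task-specific heads.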

Cite

Text

Shen et al. "RepVF: A Unified Vector Fields Representation for Multi-Task 3D Perception." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73411-3_16

Markdown

[Shen et al. "RepVF: A Unified Vector Fields Representation for Multi-Task 3D Perception." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/shen2024eccv-repvf/) doi:10.1007/978-3-031-73411-3_16

BibTeX

@inproceedings{shen2024eccv-repvf,
  title     = {{RepVF: A Unified Vector Fields Representation for Multi-Task 3D Perception}},
  author    = {Shen, Jianbing and Li, Chunliang and Han, Wencheng and Yin, Junbo and Zhao, Sanyuan},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73411-3_16},
  url       = {https://mlanthology.org/eccv/2024/shen2024eccv-repvf/}
}