Lane Detection Transformer Based on Multi-Frame Horizontal and Vertical Attention and Visual Transformer Module

Abstract

Lane detection requires adequate global information due to the simplicity of lane line features and the changeability of road scenes. In this paper, we propose a novel lane detection Transformer based on multi-frame input that regresses lane parameters under a lane shape model. We design a Multi-frame Horizontal and Vertical Attention (MHVA) module to obtain more global features and use a Visual Transformer (VT) module to get "lane tokens" carrying interaction information between lane instances. Extensive experiments on two public datasets show that our model achieves state-of-the-art results on the VIL-100 dataset and comparable performance on the Tusimple dataset. In addition, our model runs at 46 fps on multi-frame data while using few parameters, indicating the feasibility and practicality of our proposed method for real-time self-driving applications.
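The core idea of horizontal-and-vertical attention can be illustrated with a minimal single-frame sketch: self-attention is applied along each row of the feature map, then along each column, so every position aggregates information from its full row and column. This is a simplified illustration only; the paper's MHVA module additionally attends across multiple frames and uses learned query/key/value projections, and the function name `axial_attention` is assumed here, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(feat):
    """Self-attention along rows (horizontal), then columns (vertical).

    feat: (H, W, C) feature map. For brevity the features themselves serve
    as queries, keys, and values (no learned projections, single frame).
    """
    H, W, C = feat.shape
    # horizontal pass: each row attends over its W positions
    scores = feat @ feat.transpose(0, 2, 1) / np.sqrt(C)  # (H, W, W)
    feat = softmax(scores, axis=-1) @ feat                # (H, W, C)
    # vertical pass: each column attends over its H positions
    col = feat.transpose(1, 0, 2)                         # (W, H, C)
    scores = col @ col.transpose(0, 2, 1) / np.sqrt(C)    # (W, H, H)
    col = softmax(scores, axis=-1) @ col                  # (W, H, C)
    return col.transpose(1, 0, 2)                         # (H, W, C)

# toy feature map standing in for a backbone output
x = np.random.default_rng(0).standard_normal((4, 6, 8))
y = axial_attention(x)
```

Two axial passes give each position a global receptive field at O(HW(H+W)) cost instead of the O((HW)^2) cost of full 2-D attention, which is why this style of attention suits thin, elongated structures such as lane lines.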

Cite

Text

Zhang et al. "Lane Detection Transformer Based on Multi-Frame Horizontal and Vertical Attention and Visual Transformer Module." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19842-7_1

Markdown

[Zhang et al. "Lane Detection Transformer Based on Multi-Frame Horizontal and Vertical Attention and Visual Transformer Module." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/zhang2022eccv-lane/) doi:10.1007/978-3-031-19842-7_1

BibTeX

@inproceedings{zhang2022eccv-lane,
  title     = {{Lane Detection Transformer Based on Multi-Frame Horizontal and Vertical Attention and Visual Transformer Module}},
  author    = {Zhang, Han and Gu, Yunchao and Wang, Xinliang and Pan, Junjun and Wang, Minghui},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2022},
  doi       = {10.1007/978-3-031-19842-7_1},
  url       = {https://mlanthology.org/eccv/2022/zhang2022eccv-lane/}
}