VPDETR: End-to-End Vanishing Point DEtection TRansformers

Abstract

In the field of vanishing point detection, previous works commonly relied on extracting and clustering straight lines or on classifying candidate points as vanishing points. This paper proposes a novel end-to-end framework, called VPDETR (Vanishing Point DEtection TRansformer), that formulates vanishing point detection as a set prediction problem, applicable to both Manhattan and non-Manhattan world datasets. By using the positional embedding of anchor points as queries in the Transformer decoder and dynamically updating them layer by layer, our method takes images as input and directly outputs their vanishing points, without the need for explicit straight-line extraction or candidate point sampling. Additionally, we introduce an orthogonal loss and a cross-prediction loss to improve accuracy on Manhattan world datasets. Experimental results demonstrate that VPDETR achieves competitive performance compared to state-of-the-art methods, without requiring post-processing.
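
The orthogonal loss is not spelled out in this abstract. As a rough illustration only (not the authors' implementation), a minimal sketch of one plausible form is given below, assuming the three Manhattan-world vanishing points are represented as unit direction vectors and penalizing the deviation of their Gram matrix from the identity; the name orthogonal_loss and the tensor layout are illustrative assumptions.

import torch
import torch.nn.functional as F

def orthogonal_loss(vp_dirs: torch.Tensor) -> torch.Tensor:
    # vp_dirs: (B, 3, 3), one row per predicted Manhattan vanishing
    # direction (illustrative layout, not from the paper).
    d = F.normalize(vp_dirs, dim=-1)    # ensure unit length
    gram = d @ d.transpose(-1, -2)      # pairwise dot products, (B, 3, 3)
    eye = torch.eye(3, device=d.device).expand_as(gram)
    # A mutually orthogonal unit triplet has an identity Gram matrix.
    return ((gram - eye) ** 2).sum(dim=(-1, -2)).mean()

# The canonical Manhattan frame is exactly orthogonal, so the loss is zero.
print(orthogonal_loss(torch.eye(3).unsqueeze(0)))  # tensor(0.)

In practice such a term would be added, with some weight, to the set-prediction matching loss; the exact formulation and weighting used by VPDETR are not given on this page.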

Cite

Text

Chen et al. "VPDETR: End-to-End Vanishing Point DEtection TRansformers." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/aaai.v38i2.27881

Markdown

[Chen et al. "VPDETR: End-to-End Vanishing Point DEtection TRansformers." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/chen2024aaai-vpdetr/) doi:10.1609/aaai.v38i2.27881

BibTeX

@inproceedings{chen2024aaai-vpdetr,
  title     = {{VPDETR: End-to-End Vanishing Point DEtection TRansformers}},
  author    = {Chen, Taiyan and Ying, Xianghua and Yang, Jinfa and Wang, Ruibin and Guo, Ruohao and Xing, Bowei and Shi, Ji},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {1192--1200},
  doi       = {10.1609/aaai.v38i2.27881},
  url       = {https://mlanthology.org/aaai/2024/chen2024aaai-vpdetr/}
}