PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View

Abstract

3D plane recovery from a single image can usually be divided into several subtasks: plane detection, segmentation, parameter estimation and possibly depth estimation. Previous works tend to solve it by extending either an RCNN-based segmentation network or a dense pixel embedding-based clustering framework. However, none of them has tried to integrate the above subtasks into a unified framework; instead, they treat them separately and sequentially, which we suspect is a main source of performance limitation for existing approaches. Motivated by this finding and the success of query-based learning in enriching reasoning among semantic entities, in this paper we propose PlaneRecTR, a Transformer-based architecture that for the first time unifies all subtasks related to single-view plane recovery in a single compact model. Extensive quantitative and qualitative experiments demonstrate that our proposed unified learning achieves mutual benefits across subtasks, obtaining new state-of-the-art performance on the public ScanNet and NYUv2-Plane datasets.

Cite

Text

Shi et al. "PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00860

Markdown

[Shi et al. "PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/shi2023iccv-planerectr/) doi:10.1109/ICCV51070.2023.00860

BibTeX

@inproceedings{shi2023iccv-planerectr,
  title     = {{PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View}},
  author    = {Shi, Jingjia and Zhi, Shuaifeng and Xu, Kai},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {9377--9386},
  doi       = {10.1109/ICCV51070.2023.00860},
  url       = {https://mlanthology.org/iccv/2023/shi2023iccv-planerectr/}
}