PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View
Abstract
3D plane recovery from a single image can usually be divided into several subtasks: plane detection, segmentation, parameter estimation, and possibly depth estimation. Previous works tend to solve it by extending either an RCNN-based segmentation network or a dense pixel embedding-based clustering framework. However, none of them integrates these related subtasks into a unified framework; instead, they treat them separately and sequentially, which we suspect is a main source of performance limitations in existing approaches. Motivated by this finding and the success of query-based learning in enriching reasoning among semantic entities, in this paper we propose PlaneRecTR, a Transformer-based architecture that for the first time unifies all subtasks related to single-view plane recovery in a single compact model. Extensive quantitative and qualitative experiments demonstrate that our proposed unified learning achieves mutual benefits across subtasks, obtaining new state-of-the-art performance on the public ScanNet and NYUv2-Plane datasets.
Cite
Text
Shi et al. "PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00860
Markdown
[Shi et al. "PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/shi2023iccv-planerectr/) doi:10.1109/ICCV51070.2023.00860
BibTeX
@inproceedings{shi2023iccv-planerectr,
title = {{PlaneRecTR: Unified Query Learning for 3D Plane Recovery from a Single View}},
author = {Shi, Jingjia and Zhi, Shuaifeng and Xu, Kai},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {9377-9386},
doi = {10.1109/ICCV51070.2023.00860},
url = {https://mlanthology.org/iccv/2023/shi2023iccv-planerectr/}
}