Mask Transfiner for High-Quality Instance Segmentation
Abstract
Two-stage and query-based instance segmentation methods have achieved remarkable results. However, their segmented masks are still very coarse. In this paper, we present Mask Transfiner for high-quality and efficient instance segmentation. Instead of operating on regular dense tensors, our Mask Transfiner decomposes and represents the image regions as a quadtree. Our transformer-based approach only processes detected error-prone tree nodes and self-corrects their errors in parallel. While these sparse pixels only constitute a small proportion of the total number, they are critical to the final mask quality. This allows Mask Transfiner to predict highly accurate instance masks, at a low computational cost. Extensive experiments demonstrate that Mask Transfiner outperforms current instance segmentation methods on three popular benchmarks, significantly improving both two-stage and query-based frameworks by a large margin of +3.0 mask AP on COCO and BDD100K, and +6.6 boundary AP on Cityscapes. Our code and trained models are available at https://github.com/SysCV/transfiner.
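The quadtree idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`children`, `select_nodes`), the boolean "error-prone" flag grids, and the rule that a finer node is kept only when its parent was kept are all assumptions made for illustration. It shows only how a sparse set of tree nodes, rather than every pixel, could be selected for refinement.

```python
def children(node):
    """Return the four child cells of a quadtree node given as (level, row, col)."""
    level, row, col = node
    return [(level + 1, 2 * row + dr, 2 * col + dc)
            for dr in (0, 1) for dc in (0, 1)]


def select_nodes(root_flags, finer_flags):
    """Collect the sparse set of nodes to refine (hypothetical selection rule).

    root_flags: 2D list of bools for the coarsest level (level 0).
    finer_flags: list of 2D bool grids for levels 1, 2, ...; each grid has
        twice the resolution of the previous level.
    A finer node is kept only if it is flagged and its parent was kept.
    """
    kept = [(0, r, c)
            for r, row in enumerate(root_flags)
            for c, flagged in enumerate(row) if flagged]
    frontier = list(kept)
    for flags in finer_flags:
        next_frontier = [(lvl, r, c)
                         for node in frontier
                         for (lvl, r, c) in children(node)
                         if flags[r][c]]
        kept.extend(next_frontier)
        frontier = next_frontier
    return kept


# Toy input: a 2x2 coarse grid and a 4x4 finer grid of made-up flags.
root = [[True, False],
        [False, False]]
level1 = [[True, True, False, False],
          [False, True, False, False],
          [False, False, True, False],
          [False, False, False, False]]

nodes = select_nodes(root, [level1])
total_cells = 2 * 2 + 4 * 4
print(f"{len(nodes)} of {total_cells} cells selected: {nodes}")
```

In this toy input the flagged cell at level 1, row 2, column 2 is skipped because its parent was not flagged, so only a small fraction of the cells would reach the refinement step.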
Cite
Text
Ke et al. "Mask Transfiner for High-Quality Instance Segmentation." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.00437
Markdown
[Ke et al. "Mask Transfiner for High-Quality Instance Segmentation." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/ke2022cvpr-mask/) doi:10.1109/CVPR52688.2022.00437
BibTeX
@inproceedings{ke2022cvpr-mask,
title = {{Mask Transfiner for High-Quality Instance Segmentation}},
author = {Ke, Lei and Danelljan, Martin and Li, Xia and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2022},
pages = {4412-4421},
doi = {10.1109/CVPR52688.2022.00437},
url = {https://mlanthology.org/cvpr/2022/ke2022cvpr-mask/}
}