SOIT: Segmenting Objects with Instance-Aware Transformers
Abstract
This paper presents an end-to-end instance segmentation framework, termed SOIT, that Segments Objects with Instance-aware Transformers. Inspired by DETR, our method views instance segmentation as a direct set prediction problem and effectively removes the need for many hand-crafted components like RoI cropping, one-to-many label assignment, and non-maximum suppression (NMS). In SOIT, multiple queries are learned to directly reason a set of object embeddings of semantic category, bounding-box location, and pixel-wise mask in parallel under the global image context. The class and bounding-box can be easily embedded by a fixed-length vector. The pixel-wise mask, especially, is embedded by a group of parameters to construct a lightweight instance-aware transformer. Afterward, a full-resolution mask is produced by the instance-aware transformer without involving any RoI-based operation. Overall, SOIT introduces a simple single-stage instance segmentation framework that is both RoI- and NMS-free. Experimental results on the MS COCO dataset demonstrate that SOIT outperforms state-of-the-art instance segmentation approaches significantly. Moreover, the joint learning of multiple tasks in a unified query embedding can also substantially improve the detection performance. Code is available at https://github.com/yuxiaodongHRI/SOIT.
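The idea of embedding a mask as "a group of parameters" can be illustrated with a small sketch. The abstract does not specify the layout of the instance-aware transformer, so everything below is an assumption made for illustration: a single self-attention layer whose weights are sliced from one flat vector per query, the hypothetical names `param_count`, `split_params`, and `instance_mask`, and the hidden width `d`. It is not the authors' implementation, only a NumPy example of how per-instance parameters can turn a shared feature map into a full-resolution mask without any RoI cropping.

```python
import numpy as np

def param_count(c, d):
    """Length of the flat per-instance vector (assumed layout, not from the paper)."""
    return 3 * c * d + d + 1  # W_q, W_k, W_v, output weights, bias

def split_params(theta, c, d):
    """Slice one flat parameter vector into the weights of a tiny attention layer."""
    assert theta.shape == (param_count(c, d),)
    sizes = [c * d, c * d, c * d, d, 1]
    wq, wk, wv, wo, b = np.split(theta, np.cumsum(sizes)[:-1])
    return wq.reshape(c, d), wk.reshape(c, d), wv.reshape(c, d), wo, b[0]

def instance_mask(features, theta, d=8):
    """Apply instance-specific attention weights to shared features of shape (C, H, W)."""
    c, h, w = features.shape
    x = features.reshape(c, h * w).T              # (N, C): one token per pixel
    wq, wk, wv, wo, b = split_params(theta, c, d)
    q, k, v = x @ wq, x @ wk, x @ wv              # (N, d) each
    scores = q @ k.T / np.sqrt(d)                 # (N, N) attention logits
    scores -= scores.max(axis=1, keepdims=True)   # stabilise the softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    logits = (attn @ v) @ wo + b                  # (N,): one mask logit per pixel
    probs = 1.0 / (1.0 + np.exp(-logits))         # sigmoid
    return probs.reshape(h, w)                    # mask at the feature-map resolution

# Two queries predict two parameter vectors; both reuse the same feature map.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 6, 5))
thetas = 0.1 * rng.normal(size=(2, param_count(4, 8)))
masks = [instance_mask(features, t) for t in thetas]
```

In this sketch each query yields its own mask because only the parameter vector changes between instances, while the features are computed once for the whole image.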
Cite
Text
Yu et al. "SOIT: Segmenting Objects with Instance-Aware Transformers." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I3.20227
Markdown
[Yu et al. "SOIT: Segmenting Objects with Instance-Aware Transformers." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/yu2022aaai-soit/) doi:10.1609/AAAI.V36I3.20227
BibTeX
@inproceedings{yu2022aaai-soit,
title = {{SOIT: Segmenting Objects with Instance-Aware Transformers}},
author = {Yu, Xiaodong and Shi, Dahu and Wei, Xing and Ren, Ye and Ye, Tingqun and Tan, Wenming},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2022},
pages = {3188-3196},
doi = {10.1609/AAAI.V36I3.20227},
url = {https://mlanthology.org/aaai/2022/yu2022aaai-soit/}
}