AIMS: All-Inclusive Multi-Level Segmentation for Anything
Abstract
Despite steady progress in image segmentation toward accurate visual entity segmentation, meeting the diverse requirements of image editing applications, which call for region-of-interest selection at different levels, remains unsolved. In this paper, we propose a new task, All-Inclusive Multi-Level Segmentation (AIMS), which segments visual regions at three levels: part, entity, and relation (two entities linked by a semantic relationship). We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation. Specifically, we propose task complementarity, task association, and a prompt mask encoder for three-level prediction. Extensive experiments demonstrate the effectiveness and generalization capacity of our method compared with state-of-the-art methods trained on a single dataset and with concurrent work on segmenting anything. We will make our code and trained models publicly available.
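To make the three prediction levels concrete, below is a minimal Python sketch of how a single user prompt (e.g., a click) could map to part-, entity-, and relation-level masks. All names here (`Level`, `MultiLevelPrediction`, `toy_predict`) are hypothetical illustrations of the output format, not the authors' released interface; the dummy masks are nested only to mimic the typical containment of a part within its entity, and of an entity within the relation covering both related entities.

```python
"""Illustrative sketch of a three-level (part/entity/relation) output format.
These names are assumptions for exposition, not the AIMS codebase API."""
from dataclasses import dataclass
from enum import Enum

import numpy as np


class Level(Enum):
    PART = "part"          # e.g., a car wheel
    ENTITY = "entity"      # e.g., the whole car
    RELATION = "relation"  # e.g., the car together with the person driving it


@dataclass
class MultiLevelPrediction:
    """One prediction per level for a single prompt (e.g., a clicked point)."""
    level: Level
    mask: np.ndarray  # boolean H x W segmentation mask
    score: float      # model confidence in [0, 1]


def toy_predict(h: int = 4, w: int = 4) -> list[MultiLevelPrediction]:
    """Produce dummy masks nested as part <= entity <= relation."""
    part = np.zeros((h, w), bool)
    part[1, 1] = True
    entity = np.zeros((h, w), bool)
    entity[0:2, 0:2] = True
    relation = np.zeros((h, w), bool)
    relation[0:3, 0:3] = True
    return [
        MultiLevelPrediction(Level.PART, part, 0.9),
        MultiLevelPrediction(Level.ENTITY, entity, 0.8),
        MultiLevelPrediction(Level.RELATION, relation, 0.7),
    ]


for pred in toy_predict():
    print(pred.level.value, int(pred.mask.sum()), "pixels")
```

An image editing frontend could then expose the three masks as selectable granularities for the same click, letting the user toggle between selecting a part, its entity, or the related pair.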
Cite
Text
Qi et al. "AIMS: All-Inclusive Multi-Level Segmentation for Anything." Neural Information Processing Systems, 2023.
Markdown
[Qi et al. "AIMS: All-Inclusive Multi-Level Segmentation for Anything." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/qi2023neurips-aims/)
BibTeX
@inproceedings{qi2023neurips-aims,
title = {{AIMS: All-Inclusive Multi-Level Segmentation for Anything}},
author = {Qi, Lu and Kuen, Jason and Guo, Weidong and Gu, Jiuxiang and Lin, Zhe and Du, Bo and Xu, Yu and Yang, Ming-Hsuan},
booktitle = {Neural Information Processing Systems},
year = {2023},
url = {https://mlanthology.org/neurips/2023/qi2023neurips-aims/}
}