MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation

Abstract

Few-shot instance segmentation extends the few-shot learning paradigm to instance segmentation: the goal is to segment object instances in a query image given only a few annotated examples of novel categories. Conventional approaches address the task via prototype learning, a form of point estimation. However, this mechanism relies on prototypes (e.g., the mean of the K-shot features) for prediction, which leads to unstable performance. To overcome this disadvantage of point estimation, we propose a novel approach, dubbed MaskDiff, which models the underlying conditional distribution of a binary mask given an object region and K-shot information. Inspired by augmentation approaches that perturb data with Gaussian noise to populate low-density regions of the data distribution, we model the mask distribution with a diffusion probabilistic model. We also propose classifier-free guided mask sampling to integrate category information into the binary mask generation process. Without bells and whistles, our proposed method consistently outperforms state-of-the-art methods on both base and novel classes of the COCO dataset while also being more stable than existing methods. The source code is available at: https://github.com/minhquanlecs/MaskDiff.
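
The two generic ingredients named above, Gaussian perturbation in a diffusion model and classifier-free guidance, can be illustrated with a minimal sketch. This is not the MaskDiff implementation. It assumes the standard DDPM forward process and the usual classifier-free guidance rule, a mask flattened to a list of floats, and hypothetical names (`forward_noise`, `cfg_combine`, guidance weight `w`) chosen only for illustration.

```python
import math
import random


def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(a) * x_0, (1 - a) * I).

    x0 is a flattened mask (assumed here to be a list of floats) and
    alpha_bar_t is the cumulative noise-schedule product at step t.
    Returns the noised mask and the noise that was added.
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    xt = [scale_signal * x + scale_noise * e for x, e in zip(x0, eps)]
    return xt, eps


def cfg_combine(eps_cond, eps_uncond, w):
    """Classifier-free guidance: (1 + w) * eps_cond - w * eps_uncond.

    eps_cond is the noise predicted with the conditioning signal
    (e.g. category information); eps_uncond is the prediction with the
    condition dropped. w = 0 recovers the purely conditional prediction.
    """
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    rng = random.Random(0)
    mask = [0.0, 1.0, 1.0, 0.0]  # toy flattened binary mask
    noisy_mask, noise = forward_noise(mask, alpha_bar_t=0.5, rng=rng)
    guided = cfg_combine([1.0, 2.0], [0.0, 1.0], w=1.0)
    print(noisy_mask, guided)
```

In a real sampler the two noise predictions would come from a trained network evaluated with and without the condition. Here they are plain lists so that the combination rule can be checked in isolation.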

Cite

Text

Le et al. "MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I3.28068

Markdown

[Le et al. "MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/le2024aaai-maskdiff/) doi:10.1609/AAAI.V38I3.28068

BibTeX

@inproceedings{le2024aaai-maskdiff,
  title     = {{MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model for Few-Shot Instance Segmentation}},
  author    = {Le, Minh-Quan and Nguyen, Tam V. and Le, Trung-Nghia and Do, Thanh-Toan and Do, Minh N. and Tran, Minh-Triet},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {2874--2881},
  doi       = {10.1609/AAAI.V38I3.28068},
  url       = {https://mlanthology.org/aaai/2024/le2024aaai-maskdiff/}
}