Learning Motion-Appearance Co-Attention for Zero-Shot Video Object Segmentation
Abstract
How to make appearance and motion information interact effectively in complex scenarios is a fundamental issue in flow-based zero-shot video object segmentation. In this paper, we propose an Attentive Multi-Modality Collaboration Network (AMC-Net) to exploit appearance and motion information in a unified manner. Specifically, AMC-Net fuses robust information from multi-modality features and promotes their collaboration in two stages. First, we propose a Multi-Modality Co-Attention Gate (MCG) on the bilateral encoder branches, in which a gate function formulates co-attention scores that balance the contributions of the multi-modality features and suppress redundant and misleading information. Then, we propose a Motion Correction Module (MCM) with a visual-motion attention mechanism, which emphasizes the features of foreground objects by incorporating the spatio-temporal correspondence between appearance and motion cues. Extensive experiments on three challenging public benchmark datasets verify that our proposed network performs favorably against existing state-of-the-art methods while requiring less training data.
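The abstract does not give the exact formulation of the MCG, so the following is only a minimal sketch of the general gated co-attention idea it describes: each stream is modulated by a spatial gate predicted from the other stream, so the modalities balance each other and unreliable responses are suppressed. All names (`GatedFusion`, `app_gate`, `mot_gate`) and the (B, C, H, W) feature shapes are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    """Illustrative gated co-attention fusion of appearance and motion features.

    A hypothetical sketch of the gating idea in the abstract; NOT the
    authors' exact MCG formulation.
    """

    def __init__(self, channels: int):
        super().__init__()
        # A 1x1 conv per modality predicts a spatial gate in [0, 1].
        self.app_gate = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
        self.mot_gate = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, app_feat: torch.Tensor, mot_feat: torch.Tensor) -> torch.Tensor:
        # Each gate is computed from the *other* stream, so appearance
        # decides how much motion to trust and vice versa (co-attention).
        g_app = self.app_gate(mot_feat)  # (B, 1, H, W), broadcast over channels
        g_mot = self.mot_gate(app_feat)
        return g_app * app_feat + g_mot * mot_feat


# Usage: fuse two (B, C, H, W) feature maps from the bilateral encoder branches.
if __name__ == "__main__":
    fusion = GatedFusion(channels=256)
    app = torch.randn(2, 256, 32, 32)
    mot = torch.randn(2, 256, 32, 32)
    print(fusion(app, mot).shape)  # torch.Size([2, 256, 32, 32])
```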
Cite
Text
Yang et al. "Learning Motion-Appearance Co-Attention for Zero-Shot Video Object Segmentation." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00159

Markdown

[Yang et al. "Learning Motion-Appearance Co-Attention for Zero-Shot Video Object Segmentation." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/yang2021iccv-learning/) doi:10.1109/ICCV48922.2021.00159

BibTeX
@inproceedings{yang2021iccv-learning,
title = {{Learning Motion-Appearance Co-Attention for Zero-Shot Video Object Segmentation}},
author = {Yang, Shu and Zhang, Lu and Qi, Jinqing and Lu, Huchuan and Wang, Shuo and Zhang, Xiaoxing},
booktitle = {International Conference on Computer Vision},
year = {2021},
pages = {1564--1573},
doi = {10.1109/ICCV48922.2021.00159},
url = {https://mlanthology.org/iccv/2021/yang2021iccv-learning/}
}