Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised Video Object Segmentation

Abstract

Current state-of-the-art approaches for Semi-supervised Video Object Segmentation (Semi-VOS) propagate information from previous frames to generate a segmentation mask for the current frame. This results in high-quality segmentation across challenging scenarios such as changes in appearance and occlusion. But it also leads to unnecessary computation for stationary or slow-moving objects, where the change across frames is minimal. In this work, we exploit this observation by using temporal information to quickly identify frames with minimal change and skip the heavyweight mask generation step. To realize this efficiency, we propose a novel dynamic network that estimates change across frames and decides which path -- computing the full network or reusing the previous frame's feature -- to choose depending on the expected similarity. Experimental results show that our approach significantly improves inference speed without much accuracy degradation on challenging Semi-VOS datasets -- DAVIS 16, DAVIS 17, and YouTube-VOS. Furthermore, our approach can be applied to multiple Semi-VOS methods, demonstrating its generality. The code is available at https://github.com/HYOJINPARK/Reuse_VOS.
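
The two-path decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper's gate is learned, whereas this sketch assumes a hand-set threshold on the mean absolute pixel difference. The names `mean_abs_diff`, `segment_video`, `full_network`, and `threshold` are hypothetical, and frames are simplified to flat lists of floats.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Hypothetical change estimate: mean absolute pixel difference.

    Assumption: the paper learns its gate; this fixed metric is a stand-in.
    """
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def segment_video(
    frames: List[Frame],
    full_network: Callable[[Frame], float],
    threshold: float = 0.05,  # assumed value, not taken from the paper
) -> Tuple[List[float], int]:
    """Return one feature per frame and the number of full-network calls."""
    features: List[float] = []
    full_calls = 0
    prev_frame = None
    for frame in frames:
        small_change = (
            prev_frame is not None
            and mean_abs_diff(frame, prev_frame) < threshold
        )
        if small_change:
            # Cheap path: reuse the previous frame's feature.
            features.append(features[-1])
        else:
            # Expensive path: run the full network on this frame.
            features.append(full_network(frame))
            full_calls += 1
        prev_frame = frame
    return features, full_calls


if __name__ == "__main__":
    # Toy video: frames 0-1 are nearly identical, frame 2 changes sharply,
    # frame 3 repeats frame 2.
    video = [[0.0, 0.0], [0.0, 0.01], [1.0, 1.0], [1.0, 1.0]]
    feats, calls = segment_video(video, full_network=lambda f: sum(f))
    print(feats, calls)
```

In this toy run only the first and third frames take the expensive path, so the speedup depends on how often consecutive frames fall under the threshold.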

Cite

Text

Park et al. "Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised Video Object Segmentation." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.00830

Markdown

[Park et al. "Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised Video Object Segmentation." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/park2021cvpr-learning/) doi:10.1109/CVPR46437.2021.00830

BibTeX

@inproceedings{park2021cvpr-learning,
  title     = {{Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised Video Object Segmentation}},
  author    = {Park, Hyojin and Yoo, Jayeon and Jeong, Seohyeong and Venkatesh, Ganesh and Kwak, Nojun},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {8405-8414},
  doi       = {10.1109/CVPR46437.2021.00830},
  url       = {https://mlanthology.org/cvpr/2021/park2021cvpr-learning/}
}