Dual Embedding Learning for Video Instance Segmentation
Abstract
In this paper, we propose a novel framework to generate high-quality segmentation results in a two-stage style, aiming at the video instance segmentation task, which requires simultaneous detection, segmentation, and tracking of instances. To address this multi-task problem efficiently, we opt to first select high-quality detection proposals in each frame. The categories of the proposals are calibrated with the global context of the video. Then, each selected proposal is extended temporally by a bi-directional Instance-Pixel Dual-Tracker (IPDT), which synchronizes tracking at both the instance level and the pixel level. The instance-level module concentrates on distinguishing the target instance from other objects, while the pixel-level module focuses more on the local features of the instance. Our proposed method achieved a competitive result of 45.0% mAP on the YouTube-VOS dataset, ranking 3rd in Track 2 of the 2nd Large-scale Video Object Segmentation Challenge.
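The two-stage pipeline described above can be sketched in minimal Python. All function names and data structures here (`select_proposals`, `calibrate_categories`, `ipdt_track`, the `match_fn` callback, and the majority-vote stand-in for global-context calibration) are illustrative assumptions, not the authors' actual implementation:

```python
from collections import Counter

def select_proposals(frame_detections, score_thresh=0.5):
    """Stage 1a: keep only high-quality detection proposals in one frame."""
    return [d for d in frame_detections if d["score"] >= score_thresh]

def calibrate_categories(proposals_per_frame):
    """Stage 1b: re-label every proposal with the video-level majority
    category (a crude stand-in for calibration with global video context)."""
    votes = Counter(p["category"] for frame in proposals_per_frame for p in frame)
    if not votes:
        return proposals_per_frame
    global_cat = votes.most_common(1)[0][0]
    for frame in proposals_per_frame:
        for p in frame:
            p["category"] = global_cat
    return proposals_per_frame

def ipdt_track(proposal, frames, start_idx, match_fn):
    """Stage 2: extend one selected proposal bi-directionally through the
    video, mimicking the forward/backward passes of a dual tracker.
    `match_fn(prev, frame)` would combine instance-level and pixel-level
    matching in the real IPDT; here it is an arbitrary callback."""
    track = {start_idx: proposal}
    cur = proposal
    for t in range(start_idx + 1, len(frames)):   # forward pass
        cur = match_fn(cur, frames[t])
        if cur is None:
            break
        track[t] = cur
    cur = proposal
    for t in range(start_idx - 1, -1, -1):        # backward pass
        cur = match_fn(cur, frames[t])
        if cur is None:
            break
        track[t] = cur
    return track
```

The sketch only conveys the control flow: propose and filter per frame, calibrate categories over the whole video, then propagate each surviving proposal in both temporal directions.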
Cite
Text
Feng et al. "Dual Embedding Learning for Video Instance Segmentation." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00090
Markdown
[Feng et al. "Dual Embedding Learning for Video Instance Segmentation." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/feng2019iccvw-dual/) doi:10.1109/ICCVW.2019.00090
BibTeX
@inproceedings{feng2019iccvw-dual,
title = {{Dual Embedding Learning for Video Instance Segmentation}},
author = {Feng, Qianyu and Yang, Zongxin and Li, Peike and Wei, Yunchao and Yang, Yi},
booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
year = {2019},
pages = {717--720},
doi = {10.1109/ICCVW.2019.00090},
url = {https://mlanthology.org/iccvw/2019/feng2019iccvw-dual/}
}