Video Instance Segmentation 2019: A Winning Approach for Combined Detection, Segmentation, Classification and Tracking

Luiten, Jonathon; Torr, Philip H. S.; Leibe, Bastian

doi:10.1109/ICCVW.2019.00088

ICCVW 2019 pp. 709-712

Abstract

Video Instance Segmentation (VIS) is the task of localizing all objects in a video, segmenting them, tracking them throughout the video and classifying them into a set of predefined classes. In this work, we divide VIS into these four parts: detection, segmentation, tracking and classification. We then develop algorithms for performing each of these four subtasks individually, and combine these into a complete solution for VIS. Our solution is an adaptation of UnOVOST, the current best performing algorithm for Unsupervised Video Object Segmentation, to this VIS task. We benchmark our algorithm on the 2019 YouTube-VIS Challenge, where we obtain first place with an mAP score of 46.7%.
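To make the four-part decomposition concrete, the following is a minimal illustrative sketch of how per-frame outputs might be combined into classified tracks. It is not the method from the paper: the abstract does not describe the algorithms, so the greedy IoU linking, the score averaging, and the names `box_iou`, `link_tracks`, and `classify_track` are all assumptions chosen for illustration. The detection and segmentation stages are assumed to have already run, and bounding boxes stand in for segmentation masks.

```python
from collections import defaultdict


def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tracks(frames, iou_threshold=0.5):
    """Tracking (assumed scheme): greedily link each detection to the
    best-overlapping track that was present in the previous frame."""
    tracks = []   # each track is a list of (frame_index, detection)
    active = []   # indices of tracks extended in the previous frame
    for t, detections in enumerate(frames):
        next_active, used = [], set()
        for det in detections:
            best, best_iou = None, iou_threshold
            for ti in active:
                if ti in used:
                    continue  # one detection per track per frame
                iou = box_iou(tracks[ti][-1][1]["box"], det["box"])
                if iou >= best_iou:
                    best, best_iou = ti, iou
            if best is None:  # no sufficiently overlapping track: start a new one
                tracks.append([])
                best = len(tracks) - 1
            used.add(best)
            tracks[best].append((t, det))
            next_active.append(best)
        active = next_active
    return tracks


def classify_track(track):
    """Classification (assumed scheme): sum per-frame class scores over the
    whole track and return the highest-scoring label."""
    totals = defaultdict(float)
    for _, det in track:
        for label, score in det["scores"].items():
            totals[label] += score
    return max(totals, key=totals.get)


# Hypothetical detector output for three frames (boxes stand in for masks).
frames = [
    [{"box": (0, 0, 10, 10), "scores": {"cat": 0.6, "dog": 0.4}},
     {"box": (50, 50, 60, 60), "scores": {"person": 0.9}}],
    [{"box": (1, 0, 11, 10), "scores": {"cat": 0.3, "dog": 0.7}},
     {"box": (51, 50, 61, 60), "scores": {"person": 0.8}}],
    [{"box": (2, 0, 12, 10), "scores": {"cat": 0.2, "dog": 0.7}}],
]

tracks = link_tracks(frames)
labels = [classify_track(track) for track in tracks]
```

In this toy input the first object is labelled "cat" in the first frame but "dog" in the next two; pooling scores over the whole track yields a single consistent label per object, which is one reason a pipeline might treat classification as a track-level rather than frame-level step.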

Cite

Text

Luiten et al. "Video Instance Segmentation 2019: A Winning Approach for Combined Detection, Segmentation, Classification and Tracking." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00088

Markdown

[Luiten et al. "Video Instance Segmentation 2019: A Winning Approach for Combined Detection, Segmentation, Classification and Tracking." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/luiten2019iccvw-video/) doi:10.1109/ICCVW.2019.00088

BibTeX

@inproceedings{luiten2019iccvw-video,
  title     = {{Video Instance Segmentation 2019: A Winning Approach for Combined Detection, Segmentation, Classification and Tracking}},
  author    = {Luiten, Jonathon and Torr, Philip H. S. and Leibe, Bastian},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2019},
  pages     = {709-712},
  doi       = {10.1109/ICCVW.2019.00088},
  url       = {https://mlanthology.org/iccvw/2019/luiten2019iccvw-video/}
}