State-Aware Tracker for Real-Time Video Object Segmentation

Abstract

In this work, we address the task of semi-supervised video object segmentation (VOS) and explore how to make efficient use of video properties to tackle the challenge of semi-supervision. We propose a novel pipeline called State-Aware Tracker (SAT), which produces accurate segmentation results at real-time speed. For higher efficiency, SAT takes advantage of inter-frame consistency and treats each target object as a tracklet. For more stable and robust performance over video sequences, SAT is aware of each state and adapts itself via two feedback loops. One loop assists SAT in generating more stable tracklets. The other loop helps to construct a more robust and holistic target representation. SAT achieves a promising 72.3% J&F mean at 39 FPS on the DAVIS 2017-Val dataset, a decent trade-off between efficiency and accuracy.
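For readers unfamiliar with the reported numbers, the following is a minimal sketch of how a J&F mean and a per-frame time budget could be computed. It assumes the usual DAVIS convention in which J is the region similarity (intersection-over-union of binary masks) and J&F is the average of the mean J and the mean boundary measure F. The function names are hypothetical, the F scores are taken as given inputs rather than computed, and this is not the official DAVIS evaluation code or the authors' implementation.

```python
def jaccard(pred, gt):
    """Region similarity J: intersection-over-union of two binary masks.

    Masks are nested lists of 0/1 values with identical shapes
    (an assumption made to keep the sketch dependency-free).
    """
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Two empty masks are treated as a perfect match in this sketch.
    return 1.0 if union == 0 else inter / union


def j_and_f_mean(j_scores, f_scores):
    """Average of the mean J and the mean F over all evaluated objects."""
    j_mean = sum(j_scores) / len(j_scores)
    f_mean = sum(f_scores) / len(f_scores)
    return (j_mean + f_mean) / 2.0


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    gt = [[1, 0], [0, 0]]
    print(jaccard(pred, gt))
    print(j_and_f_mean([0.5, 0.7], [0.6, 0.8]))
    print(frame_budget_ms(39))
```

At the reported 39 FPS, the budget works out to roughly 25.6 ms per frame, which is the time the whole pipeline has to process one frame.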

Cite

Text

Chen et al. "State-Aware Tracker for Real-Time Video Object Segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020. doi:10.1109/CVPR42600.2020.00940

Markdown

[Chen et al. "State-Aware Tracker for Real-Time Video Object Segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.](https://mlanthology.org/cvpr/2020/chen2020cvpr-stateaware/) doi:10.1109/CVPR42600.2020.00940

BibTeX

@inproceedings{chen2020cvpr-stateaware,
  title     = {{State-Aware Tracker for Real-Time Video Object Segmentation}},
  author    = {Chen, Xi and Li, Zuoxin and Yuan, Ye and Yu, Gang and Shen, Jianxin and Qi, Donglian},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2020},
  doi       = {10.1109/CVPR42600.2020.00940},
  url       = {https://mlanthology.org/cvpr/2020/chen2020cvpr-stateaware/}
}