Instance-Level Video Segmentation from Object Tracks

Abstract

We address the problem of segmenting multiple object instances in complex videos. Our method does not require manual pixel-level annotation for training, and relies instead only on readily available object detectors or visual object trackers. Given object bounding boxes as input, we cast video segmentation as a weakly-supervised learning problem. Our proposed objective combines (a) a discriminative clustering term for background segmentation, (b) a spectral clustering term for grouping pixels of the same object instance, and (c) linear constraints enabling instance-level segmentation. We propose a convex relaxation of this problem and solve it efficiently using the Frank-Wolfe algorithm. We report results and compare our method to several baselines on a new video dataset for multi-instance person segmentation.
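As a rough illustration of the Frank-Wolfe step mentioned in the abstract, the sketch below runs the generic algorithm on a toy convex problem: minimizing 0.5 * ||x - target||^2 over the probability simplex. This is not the paper's objective or constraint set; the function name `frank_wolfe_simplex`, the toy `target` vector, and the standard 2/(k+2) step size are all assumptions chosen for illustration.

```python
def frank_wolfe_simplex(grad, x0, n_iters=200):
    """Generic Frank-Wolfe over the probability simplex (illustrative only)."""
    x = list(x0)
    for k in range(n_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex at the coordinate with the smallest gradient.
        i_min = min(range(len(g)), key=lambda i: g[i])
        # Standard diminishing step size; no line search is needed.
        gamma = 2.0 / (k + 2.0)
        # Convex combination of the current iterate and the chosen vertex,
        # so the iterate stays feasible without any projection.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i_min] += gamma
    return x


# Toy objective (an assumption, not from the paper): 0.5 * ||x - target||^2,
# whose gradient is x - target. The target lies in the simplex, so it is
# also the constrained minimizer.
target = [0.2, 0.5, 0.3]
grad = lambda x: [xi - ti for xi, ti in zip(x, target)]

x_star = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0], n_iters=2000)
print(x_star)
```

Each iteration only needs a linear minimization over the feasible set rather than a projection, which is generally why Frank-Wolfe is attractive for constrained convex relaxations.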

Cite

Text

Seguin et al. "Instance-Level Video Segmentation from Object Tracks." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.400

Markdown

[Seguin et al. "Instance-Level Video Segmentation from Object Tracks." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/seguin2016cvpr-instancelevel/) doi:10.1109/CVPR.2016.400

BibTeX

@inproceedings{seguin2016cvpr-instancelevel,
  title     = {{Instance-Level Video Segmentation from Object Tracks}},
  author    = {Seguin, Guillaume and Bojanowski, Piotr and Lajugie, Remi and Laptev, Ivan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2016},
  doi       = {10.1109/CVPR.2016.400},
  url       = {https://mlanthology.org/cvpr/2016/seguin2016cvpr-instancelevel/}
}