Instance-Level Video Segmentation from Object Tracks
Abstract
We address the problem of segmenting multiple object instances in complex videos. Our method does not require manual pixel-level annotation for training, and relies instead only on readily available object detectors or visual object tracking. Given object bounding boxes as input, we cast video segmentation as a weakly-supervised learning problem. Our proposed objective combines (a) a discriminative clustering term for background segmentation, (b) a spectral clustering term for grouping pixels of the same object instance, and (c) linear constraints enabling instance-level segmentation. We propose a convex relaxation of this problem and solve it efficiently using the Frank-Wolfe algorithm. We report results and compare our method to several baselines on a new video dataset for multi-instance person segmentation.
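The abstract names the Frank-Wolfe algorithm as the solver for the convex relaxation. As a rough illustration of how that family of methods works, here is a minimal sketch on a toy problem: minimizing 0.5 * ||x - b||^2 over the probability simplex. The objective, the function name `frank_wolfe_simplex`, the step-size rule 2 / (k + 2), and the iteration count are all assumptions chosen for illustration; this is not the paper's segmentation objective or constraint set.

```python
def frank_wolfe_simplex(b, num_iters=200):
    """Toy Frank-Wolfe: minimize 0.5 * ||x - b||^2 over the probability simplex.

    Hypothetical example problem, not the objective used in the paper.
    """
    n = len(b)
    x = [1.0 / n] * n  # feasible starting point (uniform distribution)
    for k in range(num_iters):
        # Gradient of 0.5 * ||x - b||^2 is (x - b).
        grad = [xi - bi for xi, bi in zip(x, b)]
        # Linear minimization oracle over the simplex: the vertex e_i
        # whose gradient coordinate is smallest.
        i_min = min(range(n), key=lambda i: grad[i])
        # Standard diminishing step size.
        gamma = 2.0 / (k + 2.0)
        # Convex combination of the current iterate and the chosen vertex,
        # so the iterate stays feasible without any projection step.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i_min] += gamma
    return x


if __name__ == "__main__":
    target = [0.2, 0.5, 0.3]  # already on the simplex, so it is the minimizer
    print(frank_wolfe_simplex(target))
```

Each iteration only requires solving a linear problem over the feasible set, which is why the method is attractive when the constraints are linear and projections would be expensive.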
Cite
Text
Seguin et al. "Instance-Level Video Segmentation from Object Tracks." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.400
Markdown
[Seguin et al. "Instance-Level Video Segmentation from Object Tracks." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/seguin2016cvpr-instancelevel/) doi:10.1109/CVPR.2016.400
BibTeX
@inproceedings{seguin2016cvpr-instancelevel,
title = {{Instance-Level Video Segmentation from Object Tracks}},
author = {Seguin, Guillaume and Bojanowski, Piotr and Lajugie, Remi and Laptev, Ivan},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2016},
doi = {10.1109/CVPR.2016.400},
url = {https://mlanthology.org/cvpr/2016/seguin2016cvpr-instancelevel/}
}