Multi-Class Semantic Video Segmentation with Exemplar-Based Object Reasoning

Abstract

We tackle the problem of semantic segmentation of dynamic scenes in video sequences. We propose to incorporate foreground object information into pixel labeling by jointly reasoning about the semantic labels of super-voxels, object instance tracks, and geometric relations between objects. We take an exemplar approach to object modeling, using a small set of object annotations and exploring the temporal consistency of object motion. After generating a set of moving object hypotheses, we design a CRF framework that jointly models the super-voxels and object instances. The optimal semantic labeling is inferred by MAP estimation of the model, which is solved by a single move-making optimization procedure. We demonstrate the effectiveness of our method on three public datasets and show that our model can achieve results superior or comparable to the state of the art with less object-level supervision.

Cite

Text

Liu et al. "Multi-Class Semantic Video Segmentation with Exemplar-Based Object Reasoning." IEEE/CVF Winter Conference on Applications of Computer Vision, 2015. doi:10.1109/WACV.2015.140

Markdown

[Liu et al. "Multi-Class Semantic Video Segmentation with Exemplar-Based Object Reasoning." IEEE/CVF Winter Conference on Applications of Computer Vision, 2015.](https://mlanthology.org/wacv/2015/liu2015wacv-multi/) doi:10.1109/WACV.2015.140

BibTeX

@inproceedings{liu2015wacv-multi,
  title     = {{Multi-Class Semantic Video Segmentation with Exemplar-Based Object Reasoning}},
  author    = {Liu, Buyu and He, Xuming and Gould, Stephen},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2015},
  pages     = {1014--1021},
  doi       = {10.1109/WACV.2015.140},
  url       = {https://mlanthology.org/wacv/2015/liu2015wacv-multi/}
}