Multiclass Semantic Video Segmentation with Object-Level Active Inference
Abstract
We address the problem of integrating object reasoning with supervoxel labeling in multiclass semantic video segmentation. To this end, we first propose an object-augmented dense CRF in the spatio-temporal domain, which captures long-range dependencies between supervoxels and imposes consistency between object and supervoxel labels. We develop an efficient mean field inference algorithm to jointly infer the supervoxel labels, object activations and their occlusion relations for a moderate number of object proposals. To scale our method to larger numbers of proposals, we adopt an active inference strategy that adaptively selects object subgraphs in the object-augmented dense CRF. We formulate this selection problem as a Markov Decision Process and learn an approximately optimal policy based on a reward of accuracy improvement and a set of well-designed model and input features. We evaluate our method on three publicly available multiclass semantic video segmentation datasets and demonstrate superior efficiency and accuracy.
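To make the mean field step mentioned in the abstract concrete, the following is a minimal sketch of parallel mean field updates on a small, fully connected CRF with a Potts penalty and a Gaussian affinity. It is a generic illustration only, not the paper's object-augmented model: it omits object activations, occlusion relations and the active selection of object subgraphs, and it uses a naive quadratic-time loop rather than any efficient filtering. The function name `mean_field` and the parameters `weight`, `bandwidth` and `iters` are assumptions introduced here.

```python
import math


def mean_field(unary, feats, weight=1.0, bandwidth=1.0, iters=10):
    """Naive O(N^2) mean field for a fully connected Potts CRF (illustrative).

    unary[i][l] : cost of assigning label l to node i (lower is better)
    feats[i]    : feature vector of node i (e.g. position, time, colour)
    Returns q, where q[i][l] approximates the marginal of label l at node i.
    """
    n, num_labels = len(unary), len(unary[0])

    # Gaussian affinity between two feature vectors.
    def kernel(a, b):
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return math.exp(-d2 / (2.0 * bandwidth ** 2))

    # Dense connectivity: every pair of distinct nodes is linked.
    k = [[0.0 if i == j else kernel(feats[i], feats[j]) for j in range(n)]
         for i in range(n)]

    # Turn a list of costs into a normalized distribution exp(-cost) / Z.
    def softmax_neg(costs):
        m = min(costs)
        e = [math.exp(-(c - m)) for c in costs]
        z = sum(e)
        return [v / z for v in e]

    # Initialize the marginals from the unary costs alone.
    q = [softmax_neg(u) for u in unary]

    for _ in range(iters):
        new_q = []
        for i in range(n):
            costs = []
            for l in range(num_labels):
                # Potts message: affinity-weighted probability that each
                # other node takes a label different from l.
                msg = sum(k[i][j] * (1.0 - q[j][l]) for j in range(n))
                costs.append(unary[i][l] + weight * msg)
            new_q.append(softmax_neg(costs))
        q = new_q  # parallel update: all nodes use the previous marginals
    return q
```

In this sketch a node whose unary costs are nearly tied is pulled toward the label favoured by nodes with similar features, which is the smoothing behaviour a dense pairwise term is meant to provide.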
Cite
Text
Liu and He. "Multiclass Semantic Video Segmentation with Object-Level Active Inference." Conference on Computer Vision and Pattern Recognition, 2015. doi:10.1109/CVPR.2015.7299057
Markdown
[Liu and He. "Multiclass Semantic Video Segmentation with Object-Level Active Inference." Conference on Computer Vision and Pattern Recognition, 2015.](https://mlanthology.org/cvpr/2015/liu2015cvpr-multiclass/) doi:10.1109/CVPR.2015.7299057
BibTeX
@inproceedings{liu2015cvpr-multiclass,
title = {{Multiclass Semantic Video Segmentation with Object-Level Active Inference}},
author = {Liu, Buyu and He, Xuming},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2015},
doi = {10.1109/CVPR.2015.7299057},
url = {https://mlanthology.org/cvpr/2015/liu2015cvpr-multiclass/}
}