Unsupervised Semantic Scene Labeling for Streaming Data

Abstract

We introduce an unsupervised semantic scene labeling approach that continuously learns and adapts semantic models discovered within a data stream. While closely related to unsupervised video segmentation, our algorithm is not designed to be an early video processing strategy that produces coherent over-segmentations, but instead, to directly learn higher-level semantic concepts. This is achieved with an ensemble-based approach, where each learner clusters data from a local window in the data stream. Overlapping local windows are processed and encoded in a graph structure to create a label mapping across windows and reconcile the labelings to reduce unsupervised learning noise. Additionally, we iteratively learn a merging threshold criterion from observed data similarities to automatically determine the number of learned labels without human-provided parameters. Experiments show that our approach semantically labels video streams with a high degree of accuracy, and achieves a better balance of under- and over-segmentation entropy than existing video segmentation algorithms given a similar number of output labels.
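
The sketch below illustrates the high-level idea of clustering overlapping local windows of a feature stream and reconciling labels across windows. It is not the authors' method: KMeans with a fixed `k` stands in for the ensemble learners (the paper learns the number of labels automatically), and majority voting on the overlap stands in for the paper's graph-based label mapping. The `window`, `stride`, and `k` values are illustrative placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

def stream_label(frames, window=50, stride=25, k=5, seed=0):
    """Cluster overlapping windows of a feature stream (frames:
    array of shape [n, d]) and reconcile labels across windows by
    majority vote on the overlapping region. Hypothetical sketch,
    not the paper's algorithm."""
    labels = np.full(len(frames), -1, dtype=int)
    next_label = 0   # next unused global label id
    prev_end = 0     # end index of the previously labeled region
    for start in range(0, len(frames) - window + 1, stride):
        X = frames[start:start + window]
        local = KMeans(n_clusters=k, n_init=10,
                       random_state=seed).fit_predict(X)
        # Map each local cluster to an existing global label if the
        # overlap with earlier windows votes for one; otherwise mint
        # a new global label (stand-in for the graph-based mapping).
        mapping = {}
        for c in range(k):
            idx = np.where(local == c)[0] + start
            overlap = labels[idx[idx < prev_end]]
            overlap = overlap[overlap >= 0]
            if overlap.size:
                mapping[c] = int(np.bincount(overlap).argmax())
            else:
                mapping[c] = next_label
                next_label += 1
        labels[start:start + window] = [mapping[c] for c in local]
        prev_end = start + window
    return labels
```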

Cite

Text

Wigness and Rogers III. "Unsupervised Semantic Scene Labeling for Streaming Data." Conference on Computer Vision and Pattern Recognition, 2017. doi:10.1109/CVPR.2017.626

Markdown

[Wigness and Rogers III. "Unsupervised Semantic Scene Labeling for Streaming Data." Conference on Computer Vision and Pattern Recognition, 2017.](https://mlanthology.org/cvpr/2017/wigness2017cvpr-unsupervised/) doi:10.1109/CVPR.2017.626

BibTeX

@inproceedings{wigness2017cvpr-unsupervised,
  title     = {{Unsupervised Semantic Scene Labeling for Streaming Data}},
  author    = {Wigness, Maggie and Rogers, III, John G.},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2017},
  doi       = {10.1109/CVPR.2017.626},
  url       = {https://mlanthology.org/cvpr/2017/wigness2017cvpr-unsupervised/}
}