Active Frame, Location, and Detector Selection for Automated and Manual Video Annotation

Abstract

We describe an information-driven active selection approach to determine which detectors to deploy, at which locations and in which frames of a video, to minimize semantic class label uncertainty at every pixel, at the smallest computational cost that ensures a given uncertainty bound. We show minimal performance reduction compared to a "paragon" algorithm that runs all detectors at all locations in all frames, at a small fraction of the computational cost. Because our method accounts for uncertainty in the labeling mechanism, it can handle both "oracles" (manual annotation) and noisy detectors (automated annotation).
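
To make the selection idea concrete, here is a minimal sketch, not the authors' exact algorithm: greedily pick the (site, detector) query with the best expected entropy reduction per unit cost, update the label belief with the observed response, and stop once every site's uncertainty falls below a given bound. All names, the cost model, and the run_detector callable are illustrative assumptions.

import numpy as np

def entropy(p, eps=1e-12):
    # Shannon entropy of a categorical distribution.
    p = np.clip(p, eps, 1.0)
    return float(-(p * np.log(p)).sum())

def bayes_update(belief, likelihood, output):
    # Posterior over classes after observing detector output `output`,
    # where likelihood[k, c] = P(output = k | true class = c).
    post = belief * likelihood[output]
    return post / post.sum()

def expected_info_gain(belief, likelihood):
    # Expected entropy reduction from running one detector at one site.
    gain = entropy(belief)
    for k in range(likelihood.shape[0]):
        p_k = float(belief @ likelihood[k])   # predictive prob of output k
        if p_k > 0:
            gain -= p_k * entropy(bayes_update(belief, likelihood, k))
    return gain

def active_annotate(beliefs, detectors, costs, run_detector, h_bound=0.1):
    # beliefs:   dict site -> class-probability vector (a "site" stands in
    #            for a location in a frame)
    # detectors: dict name -> likelihood matrix P(output | true class)
    # costs:     dict name -> computational (or human) cost of one query
    plan = []
    while True:
        open_sites = [s for s, b in beliefs.items() if entropy(b) > h_bound]
        if not open_sites:
            break                              # uncertainty bound met everywhere
        site, det, gain = max(
            ((s, d, expected_info_gain(beliefs[s], detectors[d]))
             for s in open_sites for d in detectors),
            key=lambda t: t[2] / costs[t[1]])
        if gain <= 1e-9:
            break                              # no remaining detector is informative
        output = run_detector(site, det)       # observe a (noisy) response
        beliefs[site] = bayes_update(beliefs[site], detectors[det], output)
        plan.append((site, det))
    return plan

Note how the same loop covers both regimes the abstract mentions: an "oracle" (manual annotator) is a detector whose likelihood matrix is the identity, so one query resolves a site exactly, while a noisy automated detector has off-diagonal confusion mass and may be queried repeatedly before the bound is met.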

Cite

Text

Karasev et al. "Active Frame, Location, and Detector Selection for Automated and Manual Video Annotation." Conference on Computer Vision and Pattern Recognition, 2014. doi:10.1109/CVPR.2014.273

Markdown

[Karasev et al. "Active Frame, Location, and Detector Selection for Automated and Manual Video Annotation." Conference on Computer Vision and Pattern Recognition, 2014.](https://mlanthology.org/cvpr/2014/karasev2014cvpr-active/) doi:10.1109/CVPR.2014.273

BibTeX

@inproceedings{karasev2014cvpr-active,
  title     = {{Active Frame, Location, and Detector Selection for Automated and Manual Video Annotation}},
  author    = {Karasev, Vasiliy and Ravichandran, Avinash and Soatto, Stefano},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2014},
  doi       = {10.1109/CVPR.2014.273},
  url       = {https://mlanthology.org/cvpr/2014/karasev2014cvpr-active/}
}