Assisted Video Object Labeling by Joint Tracking of Regions and Keypoints

Abstract

Manual labeling of objects in videos is a tedious task. We present an approach that automatically propagates labels from a single frame to the subsequent ones. We tackle the challenging problem of tracking segmented regions by combining keypoint tracking with an advanced multiple-region matching strategy based on inclusion similarity and connected regions. We ran experiments on a 101-frame driving video sequence, for which we produced the corresponding hand-labeled ground truth. We make this valuable dataset available to the research community. We show that our technique can accommodate variations in segmentation (and correct them), even in the presence of multiple independent motions and partial occlusion. Results show that most of the labeled pixels can be correctly propagated even after a hundred frames. The performance of this automatic propagation mechanism over many frames can greatly reduce the user effort in the task of video object labeling.

Cite

Text

Fauqueur et al. "Assisted Video Object Labeling by Joint Tracking of Regions and Keypoints." IEEE/CVF International Conference on Computer Vision, 2007. doi:10.1109/ICCV.2007.4409124

Markdown

[Fauqueur et al. "Assisted Video Object Labeling by Joint Tracking of Regions and Keypoints." IEEE/CVF International Conference on Computer Vision, 2007.](https://mlanthology.org/iccv/2007/fauqueur2007iccv-assisted/) doi:10.1109/ICCV.2007.4409124

BibTeX

@inproceedings{fauqueur2007iccv-assisted,
  title     = {{Assisted Video Object Labeling by Joint Tracking of Regions and Keypoints}},
  author    = {Fauqueur, Julien and Brostow, Gabriel J. and Cipolla, Roberto},
  booktitle = {IEEE/CVF International Conference on Computer Vision},
  year      = {2007},
  pages     = {1--7},
  doi       = {10.1109/ICCV.2007.4409124},
  url       = {https://mlanthology.org/iccv/2007/fauqueur2007iccv-assisted/}
}