Understanding the Nature of First-Person Videos: Characterization and Classification Using Low-Level Features

Abstract

First-person view (FPV) video data is set to proliferate rapidly, due to many consumer wearable-camera devices coming onto the market. Research into FPV (or "egocentric") vision is also becoming more common in the computer vision community. However, it is still unclear what the fundamental characteristics of such data are. How is it really different from third-person view (TPV) data? Can all FPV data be treated the same? In this first attempt to approach these questions in a quantitative and empirical manner, we analyzed a meta-collection of 21 FPV and TPV datasets totaling more than 165 hours of video. We performed the first quantitative characterization of FPV videos over multiple datasets, encompassing virtually all available FPV datasets. Validating this characterization, linear classifiers trained on low-level features to perform FPV-versus-TPV classification achieved good baseline performance. Accuracy peaked at 81% for 2-minute clips, but 67% accuracy was achieved even with 1-second clips. Our low-level features are fast to compute and do not require annotation. Overall, our work uncovered insights regarding the basic nature and characteristics of FPV data.
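
As a rough illustration of what FPV-versus-TPV classification with a linear classifier on low-level features could look like, the sketch below trains a small logistic-regression model in pure Python. Everything in it is an assumption made for illustration: the abstract does not specify the features or the classifier type, so the two-dimensional feature vector (imagined here as per-clip motion statistics), the synthetic Gaussian data, and the helper names `make_clip_features`, `train_linear_classifier`, and `predict` are all hypothetical. It is not the authors' pipeline, and its accuracy on synthetic data says nothing about the 81% and 67% figures reported above.

```python
import math
import random


def make_clip_features(rng, is_fpv):
    # Hypothetical 2-D low-level feature vector for one clip, e.g.
    # (mean motion magnitude, motion variability). The class centers are
    # invented for illustration and are not taken from the paper.
    center = (2.0, 1.5) if is_fpv else (0.5, 0.5)
    return [rng.gauss(c, 0.3) for c in center]


def make_dataset(rng, n_per_class):
    # Label 1 = FPV, label 0 = TPV.
    data = [(make_clip_features(rng, True), 1) for _ in range(n_per_class)]
    data += [(make_clip_features(rng, False), 0) for _ in range(n_per_class)]
    rng.shuffle(data)
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_classifier(data, lr=0.5, epochs=200):
    # Logistic regression fitted by full-batch gradient descent on
    # mean-centered features. Returns (feature mean, weights, bias).
    dim = len(data[0][0])
    mean = [sum(x[j] for x, _ in data) / len(data) for j in range(dim)]
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            xc = [x[j] - mean[j] for j in range(dim)]
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xc)) + b) - y
            for j in range(dim):
                grad_w[j] += err * xc[j]
            grad_b += err
        w = [wj - lr * g / len(data) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    return mean, w, b


def predict(model, x):
    # Linear decision rule: positive score means FPV (1), otherwise TPV (0).
    mean, w, b = model
    score = sum(wj * (xj - mj) for wj, xj, mj in zip(w, x, mean)) + b
    return 1 if score > 0 else 0


def accuracy(model, data):
    return sum(predict(model, x) == y for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train, test = make_dataset(rng, 100), make_dataset(rng, 50)
model = train_linear_classifier(train)
test_accuracy = accuracy(model, test)
```

In a real setting, the synthetic `make_clip_features` step would be replaced by features computed from the video clips themselves, with one feature vector per clip of a chosen duration.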

Cite

Text

Tan et al. "Understanding the Nature of First-Person Videos: Characterization and Classification Using Low-Level Features." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2014. doi:10.1109/CVPRW.2014.85

Markdown

[Tan et al. "Understanding the Nature of First-Person Videos: Characterization and Classification Using Low-Level Features." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2014.](https://mlanthology.org/cvprw/2014/tan2014cvprw-understanding/) doi:10.1109/CVPRW.2014.85

BibTeX

@inproceedings{tan2014cvprw-understanding,
  title     = {{Understanding the Nature of First-Person Videos: Characterization and Classification Using Low-Level Features}},
  author    = {Tan, Cheston and Goh, Hanlin and Chandrasekhar, Vijay and Li, Liyuan and Lim, Joo-Hwee},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2014},
  pages     = {549--556},
  doi       = {10.1109/CVPRW.2014.85},
  url       = {https://mlanthology.org/cvprw/2014/tan2014cvprw-understanding/}
}