Understanding Egocentric Activities

Abstract

We present a method to analyze daily activities, such as meal preparation, using video from an egocentric camera. Our method performs inference about activities, actions, hands, and objects. Daily activities are a challenging domain for activity recognition, and one that is well suited to an egocentric approach. In contrast to previous activity recognition methods, our approach does not require pre-trained detectors for objects and hands. Instead, we demonstrate the ability to learn a hierarchical model of an activity by exploiting the consistent appearance of objects, hands, and actions that results from the egocentric context. We show that joint modeling of activities, actions, and objects leads to superior performance in comparison to the case where they are considered independently. We introduce a novel representation of actions based on object-hand interactions and experimentally demonstrate the superior performance of our representation in comparison to standard activity representations such as bag of words.

Cite

Text

Fathi et al. "Understanding Egocentric Activities." IEEE/CVF International Conference on Computer Vision, 2011. doi:10.1109/ICCV.2011.6126269

Markdown

[Fathi et al. "Understanding Egocentric Activities." IEEE/CVF International Conference on Computer Vision, 2011.](https://mlanthology.org/iccv/2011/fathi2011iccv-understanding/) doi:10.1109/ICCV.2011.6126269

BibTeX

@inproceedings{fathi2011iccv-understanding,
  title     = {{Understanding Egocentric Activities}},
  author    = {Fathi, Alireza and Farhadi, Ali and Rehg, James M.},
  booktitle = {IEEE/CVF International Conference on Computer Vision},
  year      = {2011},
  pages     = {407--414},
  doi       = {10.1109/ICCV.2011.6126269},
  url       = {https://mlanthology.org/iccv/2011/fathi2011iccv-understanding/}
}