Understanding Egocentric Activities
Abstract
We present a method to analyze daily activities, such as meal preparation, using video from an egocentric camera. Our method performs inference about activities, actions, hands, and objects. Daily activities are a challenging domain for activity recognition and are well suited to an egocentric approach. In contrast to previous activity recognition methods, our approach does not require pre-trained detectors for objects and hands. Instead, we demonstrate the ability to learn a hierarchical model of an activity by exploiting the consistent appearance of objects, hands, and actions that results from the egocentric context. We show that joint modeling of activities, actions, and objects leads to superior performance in comparison to the case where they are considered independently. We introduce a novel representation of actions based on object-hand interactions and experimentally demonstrate the superior performance of our representation in comparison to standard activity representations such as bag of words.
Cite
Text
Fathi et al. "Understanding Egocentric Activities." IEEE/CVF International Conference on Computer Vision, 2011. doi:10.1109/ICCV.2011.6126269
Markdown
[Fathi et al. "Understanding Egocentric Activities." IEEE/CVF International Conference on Computer Vision, 2011.](https://mlanthology.org/iccv/2011/fathi2011iccv-understanding/) doi:10.1109/ICCV.2011.6126269
BibTeX
@inproceedings{fathi2011iccv-understanding,
title = {{Understanding Egocentric Activities}},
author = {Fathi, Alireza and Farhadi, Ali and Rehg, James M.},
booktitle = {IEEE/CVF International Conference on Computer Vision},
year = {2011},
pages = {407-414},
doi = {10.1109/ICCV.2011.6126269},
url = {https://mlanthology.org/iccv/2011/fathi2011iccv-understanding/}
}