How Shall We Evaluate Egocentric Action Recognition?

Abstract

Egocentric action analysis methods often assume that input videos are trimmed, and hence they tend to focus on action classification rather than recognition. Consequently, the adopted evaluation schemes are often unable to assess important properties of the desired action video segmentation output, which are deemed meaningful in real scenarios (e.g., oversegmentation and boundary localization precision). To overcome the limits of current evaluation methodologies, we propose a set of measures aimed at quantitatively and qualitatively assessing the performance of egocentric action recognition methods. To improve the exploitability of current action classification methods in the recognition scenario, we investigate how frame-wise predictions can be turned into action-based temporal video segmentations. Experiments on both synthetic and real data show that the proposed set of measures can help improve evaluation and drive the design of egocentric action recognition methods.
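The abstract mentions turning frame-wise predictions into action-based temporal video segmentations. A minimal sketch of the basic idea, assuming a simple run-length grouping of consecutive identical frame labels (a hypothetical helper, not the paper's actual procedure):

```python
def frames_to_segments(labels):
    """Group consecutive frame-wise labels into (label, start, end) segments.

    `labels` is a sequence of per-frame class labels; `end` is exclusive.
    This is a naive grouping: any post-processing (e.g., smoothing short
    spurious runs to reduce oversegmentation) would be applied beforehand.
    """
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current segment at a label change or at the sequence end.
        if i == len(labels) or labels[i] != labels[i - 1]:
            segments.append((labels[i - 1], start, i))
            start = i
    return segments

# Example: six frames with labels 0,0,1,1,1,0 yield three segments.
print(frames_to_segments([0, 0, 1, 1, 1, 0]))
# → [(0, 0, 2), (1, 2, 5), (0, 5, 6)]
```

Measures such as oversegmentation penalties or boundary localization precision would then be computed by comparing these predicted segments against ground-truth segments.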

Cite

Text

Furnari et al. "How Shall We Evaluate Egocentric Action Recognition?." IEEE/CVF International Conference on Computer Vision Workshops, 2017. doi:10.1109/ICCVW.2017.280

Markdown

[Furnari et al. "How Shall We Evaluate Egocentric Action Recognition?." IEEE/CVF International Conference on Computer Vision Workshops, 2017.](https://mlanthology.org/iccvw/2017/furnari2017iccvw-we/) doi:10.1109/ICCVW.2017.280

BibTeX

@inproceedings{furnari2017iccvw-we,
  title     = {{How Shall We Evaluate Egocentric Action Recognition?}},
  author    = {Furnari, Antonino and Battiato, Sebastiano and Farinella, Giovanni Maria},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2017},
  pages     = {2373--2382},
  doi       = {10.1109/ICCVW.2017.280},
  url       = {https://mlanthology.org/iccvw/2017/furnari2017iccvw-we/}
}