Towards an Unequivocal Representation of Actions
Abstract
This work introduces verb-only representations for actions and interactions, addressing the problem of describing similar motions (e.g. 'open door', 'open cupboard') and distinguishing differing ones (e.g. 'open door' vs 'open bottle') using verb-only labels. Current approaches neglect legitimate semantic ambiguities and class overlaps between verbs (Fig. 1), relying on the objects to disambiguate interactions. We deviate from single-verb labels and introduce a mapping between observations and multiple verb labels in order to create an Unequivocal Representation of Actions. The new representation benefits from an increased vocabulary and a soft assignment to an enriched space of verb labels. We learn these representations as multi-output regression, using a two-stream fusion CNN. The proposed approach outperforms conventional single-verb labels (also known as majority voting) on three egocentric datasets for both recognition and retrieval.
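To make the contrast between majority voting and soft assignment concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the verb vocabulary, the annotation lists, the choice of annotator fractions as soft scores, and the mean squared error are all assumptions made for illustration, and the two-stream fusion CNN that would produce the predictions is omitted.

```python
from collections import Counter

# Hypothetical verb vocabulary; the vocabulary used in the paper is not given here.
VERBS = ["open", "pull", "turn", "twist", "push"]

def soft_label(annotations, verbs=VERBS):
    """Soft assignment: fraction of annotations that chose each verb."""
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[v] / total for v in verbs]

def majority_label(annotations, verbs=VERBS):
    """Single-verb baseline: one-hot vector for the most frequent verb."""
    top_verb, _ = Counter(annotations).most_common(1)[0]
    return [1.0 if v == top_verb else 0.0 for v in verbs]

def squared_error(pred, target):
    """Mean squared error over all verb outputs (an assumed regression loss)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

# Made-up annotations for two interactions that share the verb 'open'.
door = ["open", "open", "pull", "open"]
bottle = ["open", "open", "twist", "turn"]

print(majority_label(door), majority_label(bottle))  # identical one-hot vectors
print(soft_label(door), soft_label(bottle))          # distinct score vectors
```

With these made-up annotations, majority voting maps both interactions to the same 'open' one-hot vector, while the soft labels keep them apart. A regressor trained against such soft targets would output one score per verb rather than a single class.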
Cite
Text
Wray et al. "Towards an Unequivocal Representation of Actions." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018.
Markdown
[Wray et al. "Towards an Unequivocal Representation of Actions." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018.](https://mlanthology.org/cvprw/2018/wray2018cvprw-unequivocal/)
BibTeX
@inproceedings{wray2018cvprw-unequivocal,
title = {{Towards an Unequivocal Representation of Actions}},
author = {Wray, Michael and Moltisanti, Davide and Damen, Dima},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2018},
pages = {1127-1131},
url = {https://mlanthology.org/cvprw/2018/wray2018cvprw-unequivocal/}
}