Towards the Computational Perception of Action
Abstract
Understanding observations of interacting objects requires one to reason about qualitative scene dynamics. For example, on observing a hand lifting a can, we may infer that an 'active' hand is applying an upwards force (by grasping) to lift a 'passive' can. Previously we presented a system that infers qualitative scene dynamics from the instantaneous motion of objects. However, since that analysis only considered single frames in isolation, there were often multiple interpretations for each frame. In this work we show how the dynamic information inferred at each frame can be integrated over time to reduce ambiguity. Our approach to integrating information is to extend our representation to describe objects by a set of properties or capabilities that are assumed to persist over time. Given this extended representation we find interpretations that require the smallest set(s) of properties over the whole image sequence.
Cite
Text
Mann and Jepson. "Towards the Computational Perception of Action." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1998. doi:10.1109/CVPR.1998.698694
Markdown
[Mann and Jepson. "Towards the Computational Perception of Action." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1998.](https://mlanthology.org/cvpr/1998/mann1998cvpr-computational/) doi:10.1109/CVPR.1998.698694
BibTeX
@inproceedings{mann1998cvpr-computational,
title = {{Towards the Computational Perception of Action}},
author = {Mann, Richard and Jepson, Allan D.},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {1998},
pages = {794-799},
doi = {10.1109/CVPR.1998.698694},
url = {https://mlanthology.org/cvpr/1998/mann1998cvpr-computational/}
}