Learning 3D Action Models from a Few 2D Videos for View Invariant Action Recognition

Abstract

Most existing approaches for learning action models work by extracting suitable low-level features and then training appropriate classifiers. Such approaches require large amounts of training data and do not generalize well to variations in viewpoint and scale, or across datasets. Some recent work learns multi-view action models from Mocap data, but obtaining such data is time-consuming and requires costly infrastructure. We present a method that addresses both of these issues by learning action models from just a few video training samples. We model each action as a sequence of primitive actions, represented as functions that transform the actor's state. We formulate model learning as a curve-fitting problem, and present a novel algorithm for learning human actions by lifting 2D annotations of a few keyposes to 3D and interpolating between them. Actions are inferred by sampling the models and accumulating the feature weights learned discriminatively using a latent-state Perceptron algorithm. We show results comparable to the state of the art on the standard Weizmann dataset, with a much smaller train:test ratio, and also on datasets for visual gesture recognition and cluttered grocery store environments.
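
As a rough illustration of the "interpolating between keyposes" idea mentioned in the abstract, the sketch below fills in intermediate poses between a few 3D keyposes. It is not the authors' algorithm: the abstract describes curve fitting with primitive actions, whereas this example assumes plain linear interpolation, and the names `Pose` and `interpolate_keyposes` and the flat-list pose representation are hypothetical choices made for the example.

```python
from typing import List, Sequence

# Hypothetical pose representation: a flat list of 3D joint coordinates
# (or joint angles). The abstract does not specify the actual state format.
Pose = List[float]


def interpolate_keyposes(keyposes: Sequence[Pose], steps: int) -> List[Pose]:
    """Linearly interpolate between consecutive keyposes.

    Assumption: linear interpolation stands in for the paper's
    curve-fitting formulation. `steps` is the number of intermediate
    poses generated inside each keypose-to-keypose segment.
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    if len(keyposes) < 2:
        # Nothing to interpolate; return a copy of the input.
        return [list(p) for p in keyposes]

    frames: List[Pose] = []
    for start, end in zip(keyposes, keyposes[1:]):
        if len(start) != len(end):
            raise ValueError("all keyposes must have the same dimension")
        # Emit the segment's start pose plus `steps` intermediate poses.
        for i in range(steps + 1):
            t = i / (steps + 1)
            frames.append([a + t * (b - a) for a, b in zip(start, end)])
    # Close the sequence with the final keypose.
    frames.append(list(keyposes[-1]))
    return frames


if __name__ == "__main__":
    # Two toy keyposes with a single 3D joint each.
    for frame in interpolate_keyposes([[0.0, 0.0, 0.0], [1.0, 2.0, 4.0]], steps=1):
        print(frame)
```

With `K` keyposes and `steps` intermediate poses per segment, the function returns `(K - 1) * (steps + 1) + 1` poses, so a handful of annotated keyposes can be expanded into a dense pose sequence.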

Cite

Text

Natarajan et al. "Learning 3D Action Models from a Few 2D Videos for View Invariant Action Recognition." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2010. doi:10.1109/CVPR.2010.5539876

Markdown

[Natarajan et al. "Learning 3D Action Models from a Few 2D Videos for View Invariant Action Recognition." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2010.](https://mlanthology.org/cvpr/2010/natarajan2010cvpr-learning/) doi:10.1109/CVPR.2010.5539876

BibTeX

@inproceedings{natarajan2010cvpr-learning,
  title     = {{Learning 3D Action Models from a Few 2D Videos for View Invariant Action Recognition}},
  author    = {Natarajan, Pradeep and Singh, Vivek Kumar and Nevatia, Ram},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2010},
  pages     = {2006-2013},
  doi       = {10.1109/CVPR.2010.5539876},
  url       = {https://mlanthology.org/cvpr/2010/natarajan2010cvpr-learning/}
}