Action Recognition in Still Images Using Word Embeddings from Natural Language Descriptions

Sharma, Karan; Kumar, Arun C. S.; Bhandarkar, Suchendra M.

doi:10.1109/WACVW.2017.17

WACVW 2017 pp. 58-66

Abstract

Detecting actions or verbs in still images is challenging for several reasons, including the absence of temporal information and the polysemy of verbs, both of which make it difficult to build large verb datasets. In this paper, we propose to first detect the prominent objects in the image and then infer the relevant actions or verbs using Natural Language Processing (NLP)-based techniques. The proposed scheme obviates the need to train and run visual action detectors on images, an approach that tends to be error-prone and computationally intensive. The key insight of this paper is that detecting a few significant (i.e., top) objects in an image allows one to infer the relevant actions or verbs in that image. To this end, we propose NLP-based approaches that rely on the word2vec and Object-Verb-Object triplet models to predict actions from top-object detections, and we analyze their nuances. Our experimental results show that verbs can be reliably and efficiently detected by exploiting the top object detections in an image.
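
The object-to-verb idea described in the abstract can be illustrated with a minimal sketch. It assumes that candidate verbs are ranked by their mean cosine similarity to the detected objects in an embedding space; the paper's actual scoring rule is not given here, so this is only one plausible reading. The vectors in `TOY_VECTORS`, the function `rank_verbs`, and the object and verb lists are hypothetical stand-ins for real word2vec embeddings and real detector output.

```python
import math

# Toy 3-d vectors invented for illustration only; a real system would
# look these up in a trained word2vec model instead.
TOY_VECTORS = {
    "horse":  [0.9, 0.1, 0.0],
    "person": [0.5, 0.5, 0.2],
    "ride":   [0.8, 0.3, 0.1],
    "eat":    [0.1, 0.9, 0.2],
    "type":   [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_verbs(top_objects, candidate_verbs, vectors=TOY_VECTORS):
    """Rank verbs by mean cosine similarity to the detected objects.

    This scoring rule is an assumption made for the sketch, not the
    method confirmed by the paper.
    """
    scores = {}
    for verb in candidate_verbs:
        sims = [cosine(vectors[verb], vectors[obj]) for obj in top_objects]
        scores[verb] = sum(sims) / len(sims)
    # Highest-scoring verb first.
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical detector output: the two most confident objects in an image.
ranked = rank_verbs(["horse", "person"], ["ride", "eat", "type"])
print(ranked)
```

With these toy vectors, "ride" ranks first for the objects "horse" and "person". No visual action detector is involved: only the object labels and the embedding space are used.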

Cite

Text

Sharma et al. "Action Recognition in Still Images Using Word Embeddings from Natural Language Descriptions." IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2017. doi:10.1109/WACVW.2017.17

Markdown

[Sharma et al. "Action Recognition in Still Images Using Word Embeddings from Natural Language Descriptions." IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2017.](https://mlanthology.org/wacvw/2017/sharma2017wacvw-action/) doi:10.1109/WACVW.2017.17

BibTeX

@inproceedings{sharma2017wacvw-action,
  title     = {{Action Recognition in Still Images Using Word Embeddings from Natural Language Descriptions}},
  author    = {Sharma, Karan and Kumar, Arun C. S. and Bhandarkar, Suchendra M.},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision Workshops},
  year      = {2017},
  pages     = {58-66},
  doi       = {10.1109/WACVW.2017.17},
  url       = {https://mlanthology.org/wacvw/2017/sharma2017wacvw-action/}
}