Robot-Centric Activity Recognition from First-Person RGB-D Videos

Abstract

We present a framework and algorithm to analyze first-person RGB-D videos captured by a robot while it physically interacts with humans. Specifically, we explore reactions and interactions of persons facing a mobile robot from a robot-centric view. This new perspective offers social awareness to robots, enabling interesting applications. As far as we know, there is no public 3D dataset for this problem. Therefore, we record two multi-modal first-person RGB-D datasets that reflect the setting we are analyzing. We use a humanoid and a non-humanoid robot, each equipped with a Kinect. Notably, the videos contain a high percentage of ego-motion due to the robot's self-exploration as well as its reactions to the persons' interactions. We show that separating the descriptors extracted from ego-motion areas and independent-motion areas, and using both, allows us to achieve superior recognition results. Experiments show that our algorithm recognizes the activities effectively and outperforms other state-of-the-art methods on related tasks.
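The core idea of separating descriptors from ego-motion and independent-motion areas can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: it assumes a precomputed dense flow field, approximates ego-motion by the global median flow (the paper's estimation and features differ), and pools a simple magnitude-weighted orientation histogram from each region before concatenating the two streams.

```python
import numpy as np

def split_motion_descriptors(flow, thresh=1.0):
    """Split a dense flow field (H, W, 2) into ego-motion and
    independent-motion regions, then pool a simple descriptor
    (mean flow magnitude + 8-bin orientation histogram) from each.

    Hypothetical simplification: ego-motion is approximated by the
    global median flow; real systems would fit a parametric
    (e.g. homography) camera-motion model instead."""
    # Global (ego) motion estimate: median flow over the whole frame.
    ego = np.median(flow.reshape(-1, 2), axis=0)
    # Residual flow after removing the ego-motion component.
    residual = flow - ego
    res_mag = np.linalg.norm(residual, axis=2)
    indep_mask = res_mag > thresh   # independently moving pixels
    ego_mask = ~indep_mask          # background / ego-motion pixels

    def pool(mask):
        # 1 mean-magnitude value + 8 orientation bins = 9 dims.
        if not mask.any():
            return np.zeros(9)
        v = flow[mask]                       # (N, 2) flow vectors
        mag = np.linalg.norm(v, axis=1)
        ang = np.arctan2(v[:, 1], v[:, 0])
        hist, _ = np.histogram(ang, bins=8, range=(-np.pi, np.pi),
                               weights=mag)
        total = hist.sum()
        hist = hist / total if total > 0 else hist
        return np.concatenate([[mag.mean()], hist])

    # Keep both streams, mirroring the paper's use of both descriptor sets.
    return np.concatenate([pool(ego_mask), pool(indep_mask)])
```

For example, a frame with uniform background flow plus one fast-moving patch yields a nonzero descriptor for each region, and the concatenated output has 18 dimensions (9 per stream).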

Cite

Text

Xia et al. "Robot-Centric Activity Recognition from First-Person RGB-D Videos." IEEE/CVF Winter Conference on Applications of Computer Vision, 2015. doi:10.1109/WACV.2015.54

Markdown

[Xia et al. "Robot-Centric Activity Recognition from First-Person RGB-D Videos." IEEE/CVF Winter Conference on Applications of Computer Vision, 2015.](https://mlanthology.org/wacv/2015/xia2015wacv-robot/) doi:10.1109/WACV.2015.54

BibTeX

@inproceedings{xia2015wacv-robot,
  title     = {{Robot-Centric Activity Recognition from First-Person RGB-D Videos}},
  author    = {Xia, Lu and Gori, Ilaria and Aggarwal, Jake K. and Ryoo, Michael S.},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2015},
  pages     = {357--364},
  doi       = {10.1109/WACV.2015.54},
  url       = {https://mlanthology.org/wacv/2015/xia2015wacv-robot/}
}