Recurrent Assistance: Cross-Dataset Training of LSTMs on Kitchen Tasks

Abstract

In this paper, we investigate whether information from multiple datasets can be leveraged when performing frame-based action recognition, an essential component of real-time activity monitoring systems. Specifically, we examine whether training of an LSTM benefits from pre-training or co-training on multiple datasets of related tasks when it uses non-transferred visual CNN features. A number of label mappings and multi-dataset training techniques are proposed and tested on three challenging kitchen activity datasets - Breakfast, 50 Salads and MPII Cooking 2. We show that transfer via pre-training on similar datasets with label concatenation delivers higher frame-based classification accuracy and faster training convergence than random initialisation.
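The label concatenation idea from the abstract can be illustrated with a minimal sketch: rather than aligning semantically similar labels across datasets, each dataset keeps its own label set and the joint output space is simply the concatenation of all sets. The function and dataset label lists below are hypothetical illustrations, not the authors' implementation.

```python
def concatenate_label_spaces(datasets):
    """Map (dataset, label) pairs into one joint index space.

    datasets: dict mapping dataset name -> list of action labels.
    Returns (mapping, joint_size), where mapping assigns each
    (dataset, label) pair a unique index in the concatenated space.
    """
    mapping = {}
    offset = 0
    for name, labels in datasets.items():
        for i, label in enumerate(labels):
            mapping[(name, label)] = offset + i
        offset += len(labels)  # next dataset's labels start after this one's
    return mapping, offset

# Toy label sets loosely inspired by the three kitchen datasets (illustrative only).
datasets = {
    "Breakfast": ["pour_milk", "stir"],
    "50Salads": ["cut_tomato", "mix_dressing", "stir"],
    "MPII_Cooking_2": ["wash", "peel"],
}
mapping, joint_size = concatenate_label_spaces(datasets)
# Note that "stir" in Breakfast and "stir" in 50 Salads receive distinct
# joint indices: concatenation avoids any manual cross-dataset label alignment.
```

An LSTM classifier pre-trained or co-trained over this joint space can later be fine-tuned on a single target dataset's slice of the indices.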

Cite

Text

Perrett and Damen. "Recurrent Assistance: Cross-Dataset Training of LSTMs on Kitchen Tasks." IEEE/CVF International Conference on Computer Vision Workshops, 2017. doi:10.1109/ICCVW.2017.161

Markdown

[Perrett and Damen. "Recurrent Assistance: Cross-Dataset Training of LSTMs on Kitchen Tasks." IEEE/CVF International Conference on Computer Vision Workshops, 2017.](https://mlanthology.org/iccvw/2017/perrett2017iccvw-recurrent/) doi:10.1109/ICCVW.2017.161

BibTeX

@inproceedings{perrett2017iccvw-recurrent,
  title     = {{Recurrent Assistance: Cross-Dataset Training of LSTMs on Kitchen Tasks}},
  author    = {Perrett, Toby and Damen, Dima},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2017},
  pages     = {1354--1362},
  doi       = {10.1109/ICCVW.2017.161},
  url       = {https://mlanthology.org/iccvw/2017/perrett2017iccvw-recurrent/}
}