Discriminative Hierarchical Rank Pooling for Activity Recognition
Abstract
We present hierarchical rank pooling, a video sequence encoding method for activity recognition. It consists of a network of rank pooling functions which captures the dynamics of rich convolutional neural network features within a video sequence. By stacking non-linear feature functions and rank pooling over one another, we obtain a high-capacity dynamic encoding mechanism, which is used for action recognition. We present a method for jointly learning the video representation and activity classifier parameters. Our method obtains state-of-the-art results on three important activity recognition benchmarks: 76.7% on Hollywood2, 66.9% on HMDB51, and 91.4% on UCF101.
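The stacking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a ridge regression of the frame index on time-varying mean features as a stand-in for the ranking objective, a signed square root as the non-linear feature function, and two layers only. The names `rank_pool`, `signed_sqrt`, `hierarchical_rank_pool`, and the `window`, `stride`, and `reg` parameters are hypothetical, and the joint learning of representation and classifier is not shown.

```python
import numpy as np


def rank_pool(frames, reg=1e-3):
    """Encode a (T, d) sequence as a d-vector that orders its frames in time.

    Assumption: ridge regression onto the frame index stands in for the
    ranking objective used by rank pooling.
    """
    frames = np.asarray(frames, dtype=float)
    T, d = frames.shape
    t = np.arange(1, T + 1, dtype=float)
    # Time-varying mean smooths the sequence before fitting.
    means = np.cumsum(frames, axis=0) / t[:, None]
    # Solve (M^T M + reg * I) w = M^T t so that means @ w approximates t.
    A = means.T @ means + reg * np.eye(d)
    return np.linalg.solve(A, means.T @ t)


def signed_sqrt(x):
    """Hypothetical choice of non-linear feature function."""
    return np.sign(x) * np.sqrt(np.abs(x))


def hierarchical_rank_pool(frames, window=8, stride=4):
    """Two-layer sketch: rank-pool overlapping windows, then rank-pool the result."""
    frames = np.asarray(frames, dtype=float)
    T = frames.shape[0]
    starts = range(0, max(T - window, 0) + 1, stride)
    # Layer 1: one encoding per window, passed through the non-linearity.
    layer1 = np.stack(
        [signed_sqrt(rank_pool(frames[s:s + window])) for s in starts]
    )
    # Layer 2: rank-pool the sequence of window encodings into one vector.
    return rank_pool(layer1)
```

Given per-frame CNN features of shape `(T, d)`, `hierarchical_rank_pool(features)` returns a single `d`-dimensional video descriptor that could then be passed to a classifier.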
Cite
Text
Fernando et al. "Discriminative Hierarchical Rank Pooling for Activity Recognition." Conference on Computer Vision and Pattern Recognition, 2016. doi:10.1109/CVPR.2016.212
Markdown
[Fernando et al. "Discriminative Hierarchical Rank Pooling for Activity Recognition." Conference on Computer Vision and Pattern Recognition, 2016.](https://mlanthology.org/cvpr/2016/fernando2016cvpr-discriminative/) doi:10.1109/CVPR.2016.212
BibTeX
@inproceedings{fernando2016cvpr-discriminative,
title = {{Discriminative Hierarchical Rank Pooling for Activity Recognition}},
author = {Fernando, Basura and Anderson, Peter and Hutter, Marcus and Gould, Stephen},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2016},
doi = {10.1109/CVPR.2016.212},
url = {https://mlanthology.org/cvpr/2016/fernando2016cvpr-discriminative/}
}