Video-Based Action Recognition Using Dimension Reduction of Deep Covariance Trajectories

Abstract

Convolutional Neural Networks (CNNs) have been very successful in extracting discriminative features from video data. These deep features can be summarized using covariance descriptors for further analysis. However, due to the large number of potential features, the covariance descriptors are often very high-dimensional. To facilitate large-scale data analysis, we propose a novel, metric-based dimension-reduction technique that reduces large covariance matrices to much smaller ones. We then represent videos as trajectories on the space of covariance matrices, i.e., symmetric positive-definite matrices (SPDMs), and use a Riemannian metric on this space to quantify differences across these trajectories. The resulting distance features can then be used to classify video sequences. We illustrate this comprehensive framework on the UCF11 action-recognition dataset, with classification rates that match or exceed state-of-the-art techniques.
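
To make the pipeline concrete, below is a minimal Python/NumPy sketch of the generic building blocks the abstract names: forming a regularized covariance descriptor from per-frame deep features, the affine-invariant Riemannian distance between SPD matrices (one standard metric on SPDMs; the paper may use a different one), and a crude pointwise comparison of SPDM trajectories. All function names and the toy data are illustrative assumptions, and the projection shown in project_covariance is only a generic bilinear reduction, not the authors' metric-based dimension-reduction technique.

import numpy as np

def covariance_descriptor(features, eps=1e-6):
    # features: (n_samples, d) array of deep feature vectors from one frame.
    # Returns a d x d covariance descriptor; eps nudges the estimate away
    # from singularity so it is strictly positive definite.
    c = np.cov(features, rowvar=False)
    return c + eps * np.eye(c.shape[0])

def spd_power(S, p):
    # Raise a symmetric positive-definite matrix to a real power via its
    # eigendecomposition.
    w, V = np.linalg.eigh(S)
    return (V * (w ** p)) @ V.T

def airm_distance(A, B):
    # Affine-invariant Riemannian distance between SPD matrices:
    # || log(A^{-1/2} B A^{-1/2}) ||_F, computed from the eigenvalues of
    # the (again SPD) whitened matrix.
    A_is = spd_power(A, -0.5)
    w = np.linalg.eigvalsh(A_is @ B @ A_is)
    return np.sqrt(np.sum(np.log(w) ** 2))

def project_covariance(C, W):
    # Generic bilinear projection of a d x d covariance to k x k via a
    # column-orthonormal d x k matrix W; one plausible form of covariance
    # dimension reduction (the paper's metric-based choice of W is not
    # reproduced here).
    return W.T @ C @ W

def trajectory_distance(traj_a, traj_b):
    # Crude trajectory comparison: sum the pointwise Riemannian distances
    # between two equal-length sequences of SPD descriptors.
    return sum(airm_distance(A, B) for A, B in zip(traj_a, traj_b))

# Toy usage: two 5-frame "videos", each frame yielding 100 feature
# vectors of dimension 8 (synthetic data, for illustration only).
rng = np.random.default_rng(0)
traj_a = [covariance_descriptor(rng.standard_normal((100, 8))) for _ in range(5)]
traj_b = [covariance_descriptor(rng.standard_normal((100, 8))) for _ in range(5)]
print(trajectory_distance(traj_a, traj_b))

In practice, such pairwise trajectory distances could feed a nearest-neighbor or kernel classifier; the paper's actual classification setup is described in the full text.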

Cite

Text

Dai and Srivastava. "Video-Based Action Recognition Using Dimension Reduction of Deep Covariance Trajectories." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019. doi:10.1109/CVPRW.2019.00087

Markdown

[Dai and Srivastava. "Video-Based Action Recognition Using Dimension Reduction of Deep Covariance Trajectories." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2019.](https://mlanthology.org/cvprw/2019/dai2019cvprw-videobased/) doi:10.1109/CVPRW.2019.00087

BibTeX

@inproceedings{dai2019cvprw-videobased,
  title     = {{Video-Based Action Recognition Using Dimension Reduction of Deep Covariance Trajectories}},
  author    = {Dai, Mengyu and Srivastava, Anuj},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2019},
  pages     = {611--620},
  doi       = {10.1109/CVPRW.2019.00087},
  url       = {https://mlanthology.org/cvprw/2019/dai2019cvprw-videobased/}
}