Human Motion Analysis with Deep Metric Learning

Abstract

Effectively measuring the similarity between two human motions is necessary for several computer vision tasks such as gait analysis, person identification and action retrieval. Nevertheless, we believe that traditional approaches, such as L2 distance or Dynamic Time Warping based on hand-crafted local pose metrics, fail to appropriately capture the semantic relationship across motions and, as such, are not suitable as metrics within these tasks. This work addresses this limitation by means of a triplet-based deep metric learning approach specifically tailored to human motion data, in particular to the problems of varying input size and of computationally expensive hard negative mining due to motion pair alignment. Specifically, we propose (1) a novel metric learning objective based on a triplet architecture and Maximum Mean Discrepancy, and (2) a novel deep architecture based on attentive recurrent neural networks. One benefit of our objective function is that it enforces a better separation of the different motion categories within the learned embedding space by means of the associated distribution moments. At the same time, our attentive recurrent neural network maps inputs of varying size to a fixed-size embedding while learning to focus on those motion parts that are semantically distinctive. Our experiments on two different datasets demonstrate significant improvements over conventional human motion metrics.
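The abstract names three building blocks: a triplet objective, Maximum Mean Discrepancy (MMD), and attention over a variable-length sequence. The sketch below illustrates each in its generic textbook form, using only the Python standard library. It is not the authors' formulation: the abstract does not say how the triplet term and MMD are combined, which kernel is used, or how attention scores are computed, so the Gaussian kernel, the `margin` and `sigma` parameters, and all function names here are assumptions made for illustration.

```python
import math

def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic hinge triplet loss: the positive should be closer to the
    anchor than the negative by at least `margin` (assumed value)."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return math.exp(-sq_dist(x, y) / (2.0 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of the squared MMD between two sets of embeddings."""
    def mean_k(a, b):
        return sum(gaussian_kernel(u, v, sigma) for u in a for v in b) / (len(a) * len(b))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)

def attention_pool(frames, scores):
    """Softmax-weighted average of per-frame vectors.

    Any number of frames is reduced to one vector of fixed dimension.
    In a real model the scores would be learned; here they are given."""
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) / total for d in range(dim)]
```

With these definitions, `attention_pool` returns a vector of the same dimension whether a motion has two frames or two hundred, which is the property that lets sequences of different lengths be compared in one embedding space. `mmd2` is zero when both sets are identical and grows as the two sets of embeddings drift apart.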

Cite

Text

Coskun et al. "Human Motion Analysis with Deep Metric Learning." Proceedings of the European Conference on Computer Vision (ECCV), 2018. doi:10.1007/978-3-030-01264-9_41

Markdown

[Coskun et al. "Human Motion Analysis with Deep Metric Learning." Proceedings of the European Conference on Computer Vision (ECCV), 2018.](https://mlanthology.org/eccv/2018/coskun2018eccv-human/) doi:10.1007/978-3-030-01264-9_41

BibTeX

@inproceedings{coskun2018eccv-human,
  title     = {{Human Motion Analysis with Deep Metric Learning}},
  author    = {Coskun, Huseyin and Joseph Tan, David and Conjeti, Sailesh and Navab, Nassir and Tombari, Federico},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2018},
  doi       = {10.1007/978-3-030-01264-9_41},
  url       = {https://mlanthology.org/eccv/2018/coskun2018eccv-human/}
}