S-JEPA: A Joint Embedding Predictive Architecture for Skeletal Action Recognition

Abstract

Masked self-reconstruction of joints has been shown to be a promising pretext task for self-supervised skeletal action recognition. However, this task focuses on predicting isolated, potentially noisy, joint coordinates, which uses model capacity inefficiently. In this paper, we introduce S-JEPA, a Skeleton Joint Embedding Predictive Architecture, which uses a novel pretext task: given a partial skeleton sequence, predict the latent representations of the missing joints of the same sequence. Such representations serve as abstract prediction targets that direct the modelling power towards learning high-level context and depth information, instead of unnecessary low-level details. To tackle the potential non-uniformity in these representations, we propose a simple centering operation that we find improves training stability and yields strong off-the-shelf action representations. Extensive experiments show that S-JEPA, combined with the vanilla transformer, outperforms previous state-of-the-art results on the NTU60, NTU120, and PKU-MMD datasets. Project website: https://sjepa.github.io.
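
The pretext task described above can be illustrated with a toy sketch. This is not the authors' implementation: the linear maps stand in for transformer encoders, the mean-pooling predictor, the squared-error loss, and the exact form of centering (subtracting the per-feature mean across tokens) are all assumptions made for illustration, as are the function and parameter names.

```python
import numpy as np

def center(targets):
    """Assumed form of centering: subtract the per-feature mean across tokens."""
    return targets - targets.mean(axis=0, keepdims=True)

def latent_prediction_loss(tokens, pos, mask, w_context, w_target, w_pred):
    """Toy latent-prediction loss for one skeleton sequence.

    tokens: (N, D) joint tokens; pos: (N, D) positional embeddings.
    mask:   (N,) boolean array, True where a joint token is hidden.
    w_*:    (D, D) matrices standing in for real encoders and predictor (hypothetical).
    """
    # Targets come from the full sequence and are centered before use.
    targets = center(tokens @ w_target)
    # The context branch only ever sees the visible tokens.
    visible = tokens[~mask] + pos[~mask]
    context = visible @ w_context
    # Predictor: pooled context plus the position of each missing joint.
    queries = context.mean(axis=0) + pos[mask]
    preds = queries @ w_pred
    # The loss is computed in latent space, at masked positions only.
    return float(np.mean((preds - targets[mask]) ** 2))
```

The point of the sketch is that the loss compares predicted and target embeddings rather than raw joint coordinates, which is the difference from masked self-reconstruction that the abstract highlights.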

Cite

Text

Abdelfattah and Alahi. "S-JEPA: A Joint Embedding Predictive Architecture for Skeletal Action Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73411-3_21

Markdown

[Abdelfattah and Alahi. "S-JEPA: A Joint Embedding Predictive Architecture for Skeletal Action Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/abdelfattah2024eccv-sjepa/) doi:10.1007/978-3-031-73411-3_21

BibTeX

@inproceedings{abdelfattah2024eccv-sjepa,
  title     = {{S-JEPA: A Joint Embedding Predictive Architecture for Skeletal Action Recognition}},
  author    = {Abdelfattah, Mohamed and Alahi, Alexandre},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73411-3_21},
  url       = {https://mlanthology.org/eccv/2024/abdelfattah2024eccv-sjepa/}
}