Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition

Abstract

Video understanding requires effective modeling of both motion and appearance information, particularly for few-shot action recognition. While recent advances in point tracking have been shown to improve few-shot action recognition, two fundamental challenges persist: selecting informative points to track and effectively modeling their motion patterns. We present Trokens, a novel approach that transforms trajectory points into semantic-aware relational tokens for action recognition. First, we introduce a semantic-aware sampling strategy to adaptively distribute tracking points based on object scale and semantic relevance. Second, we develop a motion modeling framework that captures both intra-trajectory dynamics through the Histogram of Oriented Displacements (HoD) and inter-trajectory relationships to model complex action patterns. Our approach effectively combines these trajectory tokens with semantic features to enhance appearance features with motion information, achieving state-of-the-art performance across six diverse few-shot action recognition benchmarks: Something-Something-V2 (both full and small splits), Kinetics, UCF101, HMDB51, and FineGym.
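To make the intra-trajectory descriptor named above more concrete, here is a minimal sketch of a generic Histogram of Oriented Displacements for a single 2D point track. The abstract does not specify the paper's formulation, so everything here is an assumption made for illustration: the function name `histogram_of_oriented_displacements`, the default of 8 orientation bins, the magnitude weighting, and the L1 normalization.

```python
import math


def histogram_of_oriented_displacements(trajectory, num_bins=8):
    """Magnitude-weighted orientation histogram for one 2D point trajectory.

    trajectory: sequence of (x, y) positions, one per frame.
    Returns a list of num_bins values that sum to 1, or all zeros if the
    point never moves. (Illustrative assumption, not the paper's exact HoD.)
    """
    hist = [0.0] * num_bins
    bin_width = 2.0 * math.pi / num_bins

    # Walk over consecutive frame pairs and bin each displacement by direction.
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        dx, dy = x1 - x0, y1 - y0
        magnitude = math.hypot(dx, dy)
        if magnitude == 0.0:
            continue  # a stationary step has no orientation

        angle = math.atan2(dy, dx) % (2.0 * math.pi)  # map to [0, 2*pi)
        # Clamp guards against floating-point rounding at the upper edge.
        bin_index = min(int(angle / bin_width), num_bins - 1)
        hist[bin_index] += magnitude  # longer steps contribute more

    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Example: a point that moves one unit right, then one unit along +y.
descriptor = histogram_of_oriented_displacements([(0, 0), (1, 0), (1, 1)])
```

A descriptor of this kind summarizes how one tracked point moves over time; it does not capture the inter-trajectory relationships that the abstract also mentions, which would need a separate step.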

Cite

Text

Kumar et al. "Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition." International Conference on Computer Vision, 2025.

Markdown

[Kumar et al. "Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/kumar2025iccv-trokens/)

BibTeX

@inproceedings{kumar2025iccv-trokens,
  title     = {{Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition}},
  author    = {Kumar, Pulkit and Huang, Shuaiyi and Walmer, Matthew and Rambhatla, Sai Saketh and Shrivastava, Abhinav},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {13544--13556},
  url       = {https://mlanthology.org/iccv/2025/kumar2025iccv-trokens/}
}