Hierarchical-Aware Orthogonal Disentanglement Framework for Fine-Grained Skeleton-Based Action Recognition
Abstract
In recent years, skeleton-based action recognition has gained significant attention due to its robustness to varying environmental conditions. However, most existing methods struggle to distinguish fine-grained actions because the motion features are subtle and inter-class variation is minimal, and they often fail to consider the underlying similarity relationships between action classes. To address these limitations, we propose a Hierarchical-aware Orthogonal Disentanglement framework (HiOD). We disentangle coarse-grained and fine-grained features by employing independent spatial-temporal granularity-aware bases, which encode semantic representations at varying levels of granularity. Additionally, we design a cross-granularity feature interaction mechanism that leverages complementary information between coarse-grained and fine-grained features. We further enhance the learning process through hierarchical prototype contrastive learning, which utilizes the parent class hierarchy to guide the learning of coarse-grained features while ensuring the distinguishability of fine-grained features within child classes. Extensive experiments on the FineGYM, FSD-10, NTU RGB+D, and NTU RGB+D 120 datasets demonstrate the superiority of our method in fine-grained action recognition tasks.
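As a rough illustration of what keeping coarse-grained and fine-grained bases orthogonal could mean, the sketch below computes a simple penalty: the sum of squared cosine similarities between every coarse basis vector and every fine basis vector. This is only an assumption about one common way to express such a constraint, not the loss used in HiOD; the function name `orthogonality_penalty` and the toy bases are hypothetical.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def orthogonality_penalty(coarse_bases, fine_bases):
    """Sum of squared cosine similarities between two sets of basis vectors.

    Hypothetical helper, not taken from the paper. Each argument is a list of
    equal-length, non-zero vectors. The result is zero exactly when every
    coarse vector is orthogonal to every fine vector.
    """
    coarse = [_normalize(v) for v in coarse_bases]
    fine = [_normalize(v) for v in fine_bases]
    total = 0.0
    for u in coarse:
        for w in fine:
            cosine = sum(a * b for a, b in zip(u, w))
            total += cosine * cosine
    return total


# Toy bases in R^4: the coarse set spans axes 0-1, the fine set spans axes 2-3.
coarse = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
fine = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

print(orthogonality_penalty(coarse, fine))    # disjoint subspaces
print(orthogonality_penalty(coarse, coarse))  # fully overlapping subspaces
```

In a training setup, a term of this kind would typically be added to the classification loss with a small weight, so that the two sets of bases are pushed toward encoding non-overlapping directions.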
Cite
Text
Chang et al. "Hierarchical-Aware Orthogonal Disentanglement Framework for Fine-Grained Skeleton-Based Action Recognition." International Conference on Computer Vision, 2025.
Markdown
[Chang et al. "Hierarchical-Aware Orthogonal Disentanglement Framework for Fine-Grained Skeleton-Based Action Recognition." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/chang2025iccv-hierarchicalaware/)
BibTeX
@inproceedings{chang2025iccv-hierarchicalaware,
title = {{Hierarchical-Aware Orthogonal Disentanglement Framework for Fine-Grained Skeleton-Based Action Recognition}},
author = {Chang, Haochen and Ren, Pengfei and Zhang, Haoyang and Xie, Liang and Chen, Hongbo and Yin, Erwei},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {11252-11261},
url = {https://mlanthology.org/iccv/2025/chang2025iccv-hierarchicalaware/}
}