HOIMamba: Efficient Mamba-Based Disentangled Progressive Learning for HOI Detection
Abstract
Human-object interaction (HOI) detection aims to localize human-object pairs and recognize their interactions. Existing single-branch, two-branch, and three-branch methods struggle to strike an appropriate trade-off among efficiency, multi-task decoupling, and collaborative learning, and they also fail to identify rare and complex interaction categories effectively. In this work, we propose HOIMamba, a novel Efficient Mamba-based Disentangled Progressive Learning method for HOI detection, which combines the advantages of the three existing approaches and adaptively aggregates multi-level interaction semantics guided by cross-task bidirectional context. Specifically, HOIMamba builds its decoder from cascaded Low-Rank Adaptations (LoRAs), achieving high efficiency, thorough task decoupling, and good multi-task collaborative learning. Furthermore, to ease interaction recognition in difficult HOI samples, a novel Mamba-based comprehensive progressive learning strategy with Cross-enhance Mamba (CEM) blocks and Detection Context Propagation (DCP) blocks is designed to gradually mine interaction-related discriminative cues from four levels. CEM blocks automatically aggregate context to generate diverse task-shared semantics and simultaneously realize cross-task interaction between the human and object branches, guiding the interaction branch to extract more expressive HOI representations. DCP blocks further transfer the comprehensive interaction context to the human and object branches to achieve rich and effective information exchange, helping the model discover more HOI instances. Extensive experimental results on two standard benchmarks demonstrate the effectiveness of HOIMamba.
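The abstract does not say how the LoRAs are cascaded inside the decoder, so the following is only a minimal sketch of the generic low-rank adaptation idea it builds on: one shared weight matrix W plus a small task-specific update B·A for each branch. The class name `LoRALinear`, the task labels, the dimensions, and the initialization are all illustrative assumptions and are not taken from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Matrix with entries drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LoRALinear:
    """Hypothetical layer: shared weight W plus one low-rank update B @ A per task."""

    def __init__(self, d_in, d_out, rank, tasks, seed=0):
        rng = random.Random(seed)
        self.weight = rand_matrix(d_out, d_in, rng)  # shared by every task
        # Common LoRA initialization: A random, B zero, so each adapter
        # starts out as a no-op on top of the shared weight.
        self.A = {t: rand_matrix(rank, d_in, rng) for t in tasks}
        self.B = {t: [[0.0] * rank for _ in range(d_out)] for t in tasks}

    def forward(self, x, task):
        """x is a d_in x 1 column vector; returns W x + B_task (A_task x)."""
        base = matmul(self.weight, x)
        low_rank = matmul(self.B[task], matmul(self.A[task], x))
        return add(base, low_rank)


# Usage: three task branches share W but own separate low-rank adapters.
layer = LoRALinear(d_in=8, d_out=8, rank=2, tasks=("human", "object", "interaction"))
x = [[1.0] for _ in range(8)]
human_features = layer.forward(x, "human")
interaction_features = layer.forward(x, "interaction")
```

In this sketch each adapter adds rank × (d_in + d_out) parameters on top of the shared d_in × d_out weight, which is the usual reason low-rank updates are a cheap way to give each task its own parameters.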
Cite
Text
Xu et al. "HOIMamba: Efficient Mamba-Based Disentangled Progressive Learning for HOI Detection." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I9.32972
Markdown
[Xu et al. "HOIMamba: Efficient Mamba-Based Disentangled Progressive Learning for HOI Detection." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/xu2025aaai-hoimamba/) doi:10.1609/AAAI.V39I9.32972
BibTeX
@inproceedings{xu2025aaai-hoimamba,
title = {{HOIMamba: Efficient Mamba-Based Disentangled Progressive Learning for HOI Detection}},
author = {Xu, Yongchao and Liu, Jiawei and Tao, Sen and Zhang, Qiang and Zha, Zheng-Jun},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {8987-8995},
doi = {10.1609/AAAI.V39I9.32972},
url = {https://mlanthology.org/aaai/2025/xu2025aaai-hoimamba/}
}