Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding

Abstract

This paper focuses on the Weakly Supervised Affordance Grounding (WSAG) task, which aims to train a model to identify affordance regions using human-object interaction images and egocentric images, without the need for costly pixel-level annotations. Most existing methods treat affordance regions as isolated and directly employ class activation maps for localization, ignoring their relationships with other object components, which weakens performance. For example, a cup's handle works together with its body to afford pouring. Capturing these region relationships is therefore beneficial for improving the localization accuracy of affordance regions. To this end, we make a first exploration of using hypergraphs to discover these relations and propose a Reasoning Mamba (R-Mamba) framework. We first extract feature embeddings from exocentric and egocentric images to construct hypergraphs consisting of multiple vertices and hyperedges, which capture the in-context local region relationships between different visual components. Subsequently, we design a Hypergraph-guided State Space (HSS) block to reorganize these local relationships from a global perspective. Through this mechanism, the model can leverage the captured relationships to improve the localization accuracy of affordance regions. Extensive experiments and visualization analyses demonstrate the superiority of our method.
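To make the idea of vertices and hyperedges over region embeddings concrete, the following is a minimal illustrative sketch and not the authors' implementation. The abstract does not say how hyperedges are formed, so this sketch assumes a common k-nearest-neighbour construction, and it uses plain mean aggregation (vertex to hyperedge, then hyperedge back to vertex) in place of the HSS block, which is not reproduced here. The function names `knn_incidence` and `hypergraph_propagate`, the value of `k`, and the random "patch" features are all hypothetical.

```python
import numpy as np


def knn_incidence(features: np.ndarray, k: int) -> np.ndarray:
    """Return an (N x N) vertex-by-hyperedge incidence matrix.

    Assumption: hyperedge j groups vertex j with its k nearest
    neighbours in Euclidean distance.
    """
    n = features.shape[0]
    sq = np.sum(features ** 2, axis=1)
    # Pairwise squared Euclidean distances.
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    # Force each vertex to rank first in its own neighbour list.
    d2[np.arange(n), np.arange(n)] = -np.inf
    nearest = np.argsort(d2, axis=1)[:, : k + 1]  # self + k neighbours
    incidence = np.zeros((n, n))
    edge_ids = np.repeat(np.arange(n), k + 1)
    incidence[nearest.ravel(), edge_ids] = 1.0
    return incidence


def hypergraph_propagate(features: np.ndarray, incidence: np.ndarray) -> np.ndarray:
    """One round of mean aggregation: vertices -> hyperedges -> vertices."""
    edge_deg = incidence.sum(axis=0)  # number of vertices in each hyperedge
    vert_deg = incidence.sum(axis=1)  # number of hyperedges touching each vertex
    edge_feats = (incidence.T @ features) / edge_deg[:, None]
    return (incidence @ edge_feats) / vert_deg[:, None]


# Hypothetical input: 16 region embeddings of dimension 8.
rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))
H = knn_incidence(patches, k=3)
refined = hypergraph_propagate(patches, H)
```

Each row of `refined` mixes a region's embedding with those of the regions that share a hyperedge with it, which is the kind of local region relationship the abstract describes; how R-Mamba reorganizes such relations globally is left to the paper itself.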

Cite

Text

Wang et al. "Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.02572

Markdown

[Wang et al. "Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/wang2025cvpr-reasoning/) doi:10.1109/CVPR52734.2025.02572

BibTeX

@inproceedings{wang2025cvpr-reasoning,
  title     = {{Reasoning Mamba: Hypergraph-Guided Region Relation Calculating for Weakly Supervised Affordance Grounding}},
  author    = {Wang, Yuxuan and Wu, Aming and Yang, Muli and Min, Yukuan and Zhu, Yihang and Deng, Cheng},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {27618-27627},
  doi       = {10.1109/CVPR52734.2025.02572},
  url       = {https://mlanthology.org/cvpr/2025/wang2025cvpr-reasoning/}
}