GKGNet: Group K-Nearest Neighbor Based Graph Convolutional Network for Multi-Label Image Recognition

Yao, Ruijie; Jin, Sheng; Xu, Lumin; Zeng, Wang; Liu, Wentao; Qian, Chen; Luo, Ping; Wu, Ji

doi:10.1007/978-3-031-72649-1_6

ECCV 2024

Abstract

Multi-Label Image Recognition (MLIR) is a challenging task that aims to predict multiple object labels in a single image while modeling the complex relationships between labels and image regions. Although convolutional neural networks and vision transformers have succeeded in processing images as regular grids of pixels or patches, these representations are sub-optimal for capturing irregular and discontinuous regions of interest. In this work, we present the first fully graph convolutional model, Group K-nearest neighbor based Graph convolutional Network (GKGNet), which models the connections between semantic label embeddings and image patches in a flexible and unified graph structure. To address the scale variance of different objects and to capture information from multiple perspectives, we propose the Group KGCN module for dynamic graph construction and message passing. Our experiments demonstrate that GKGNet achieves state-of-the-art performance with significantly lower computational costs on the challenging multi-label datasets MS-COCO and VOC2007. Code is available at https://github.com/jin-s13/GKGNet.

Cite

Text

Yao et al. "GKGNet: Group K-Nearest Neighbor Based Graph Convolutional Network for Multi-Label Image Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72649-1_6

Markdown

[Yao et al. "GKGNet: Group K-Nearest Neighbor Based Graph Convolutional Network for Multi-Label Image Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/yao2024eccv-gkgnet/) doi:10.1007/978-3-031-72649-1_6

BibTeX

@inproceedings{yao2024eccv-gkgnet,
  title     = {{GKGNet: Group K-Nearest Neighbor Based Graph Convolutional Network for Multi-Label Image Recognition}},
  author    = {Yao, Ruijie and Jin, Sheng and Xu, Lumin and Zeng, Wang and Liu, Wentao and Qian, Chen and Luo, Ping and Wu, Ji},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-72649-1_6},
  url       = {https://mlanthology.org/eccv/2024/yao2024eccv-gkgnet/}
}