Modular Graph Transformer Networks for Multi-Label Image Classification
Abstract
With the recent advances in graph neural networks, there is a rising number of studies on graph-based multi-label classification with the consideration of object dependencies within visual data. Nevertheless, graph representations can become indistinguishable due to the complex nature of label relationships. We propose a multi-label image classification framework based on graph transformer networks to fully exploit inter-label interactions. The paper presents a modular learning scheme to enhance the classification performance by segregating the computational graph into multiple sub-graphs based on modularity. The proposed approach, named Modular Graph Transformer Networks (MGTN), is capable of employing multiple backbones for better information propagation over different sub-graphs guided by graph transformers and convolutions. We validate our framework on MS-COCO and Fashion550K datasets to demonstrate improvements for multi-label image classification. The source code is available at https://github.com/ReML-AI/MGTN.
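The abstract does not say how modularity is measured or which partitioning algorithm MGTN uses. As a minimal sketch only, the block below assumes the standard Newman modularity score on an undirected, unweighted graph. It compares two ways of grouping a toy label graph. The label names, edges, and the `modularity` helper are hypothetical illustrations, not taken from the paper or its repository.

```python
def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    communities: list of collections of nodes; together they partition the graph.
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    q = 0.0
    for community in communities:
        nodes = set(community)
        # Number of edges with both endpoints inside this community.
        internal = sum(1 for u, v in edges if u in nodes and v in nodes)
        # Sum of degrees of the community's nodes.
        total_degree = sum(degree.get(n, 0) for n in nodes)
        # Fraction of internal edges minus the expected fraction under random wiring.
        q += internal / m - (total_degree / (2 * m)) ** 2
    return q


# Hypothetical label graph: two tightly connected groups joined by one edge.
edges = [
    ("person", "bicycle"), ("bicycle", "car"), ("person", "car"),
    ("fork", "knife"), ("knife", "bowl"), ("fork", "bowl"),
    ("person", "bowl"),
]

split = [{"person", "bicycle", "car"}, {"fork", "knife", "bowl"}]
whole = [{"person", "bicycle", "car", "fork", "knife", "bowl"}]

q_split = modularity(edges, split)
q_whole = modularity(edges, whole)
print("two sub-graphs:", q_split)
print("single graph:  ", q_whole)
```

A higher score for the two-group partition indicates that splitting the graph there keeps most edges inside the sub-graphs, which is the general idea behind modularity-based segregation.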
Cite
Text
Nguyen et al. "Modular Graph Transformer Networks for Multi-Label Image Classification." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I10.17098
Markdown
[Nguyen et al. "Modular Graph Transformer Networks for Multi-Label Image Classification." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/nguyen2021aaai-modular/) doi:10.1609/AAAI.V35I10.17098
BibTeX
@inproceedings{nguyen2021aaai-modular,
title = {{Modular Graph Transformer Networks for Multi-Label Image Classification}},
author = {Nguyen, Hoang D. and Vu, Xuan-Son and Le, Duc-Trong},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {9092-9100},
doi = {10.1609/AAAI.V35I10.17098},
url = {https://mlanthology.org/aaai/2021/nguyen2021aaai-modular/}
}