Multi-Label Image Classification with Multi-Layered Multi-Perspective Dynamic Semantic Representation

Abstract

Deep learning has substantially improved multi-label image classification, and graph convolutional networks have recently proven effective for modeling label dependencies. However, because semantic relations among labels are complex, the static dependencies learned by existing methods neither account for the overall characteristics of an individual image nor locate target regions accurately. We therefore propose the Multi-layered Multi-perspective Dynamic Semantic Representation (MMDSR) for multi-label image classification, which comprises three main modules: (1) a multi-scale feature reconstruction module, which aggregates complementary information from different levels of a convolutional neural network through cross-layer attention and can thereby identify target categories of different sizes; (2) a channel dual-branch cross-attention module, which explores the correlation between global information and local features in the multi-scale features through adaptive cross-fusion to locate target regions more accurately; and (3) a dynamic semantic representation module, which uses a multi-perspective weighted cosine measure to construct content-based label dependencies for each image and thus dynamically builds a semantic relationship graph. Extensive experiments on two datasets, MS-COCO and VOC2007, show that MMDSR achieves better classification performance than many state-of-the-art methods.
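To make the idea of a per-image, content-based label graph more concrete, the following is a minimal sketch of one plausible reading of a "multi-perspective weighted cosine measure". It is not the authors' implementation: the abstract does not give the formula, so the element-wise perspective weights, the averaging over perspectives, and the names `weighted_cosine`, `dynamic_label_graph`, `label_feats`, and `perspectives` are all assumptions made for illustration.

```python
import math


def weighted_cosine(u, v, w, eps=1e-12):
    """Cosine similarity between u and v after element-wise scaling by w.

    Assumption: a "perspective" is a weight vector applied to both inputs.
    """
    wu = [wi * ui for wi, ui in zip(w, u)]
    wv = [wi * vi for wi, vi in zip(w, v)]
    dot = sum(a * b for a, b in zip(wu, wv))
    norm_u = math.sqrt(sum(a * a for a in wu))
    norm_v = math.sqrt(sum(b * b for b in wv))
    return dot / (norm_u * norm_v + eps)


def dynamic_label_graph(label_feats, perspectives):
    """Build a label-by-label adjacency matrix for a single image.

    Assumption: the per-perspective similarities are simply averaged.
    """
    n = len(label_feats)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sims = [
                weighted_cosine(label_feats[i], label_feats[j], w)
                for w in perspectives
            ]
            adj[i][j] = sum(sims) / len(sims)
    return adj


# Hypothetical per-image label features (3 labels, 4 dimensions)
# and 2 hypothetical perspective weight vectors.
label_feats = [
    [1.0, 0.0, 2.0, 0.0],
    [2.0, 0.0, 4.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
]
perspectives = [
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.5, 2.0],
]
adj = dynamic_label_graph(label_feats, perspectives)
```

Because `label_feats` would be computed from each image's content, the resulting graph changes from image to image, in contrast to a static graph derived once from dataset-wide label co-occurrence statistics. In a trained model the perspective weights would presumably be learned rather than fixed as they are here.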

Cite

Text

Kuang and Li. "Multi-Label Image Classification with Multi-Layered Multi-Perspective Dynamic Semantic Representation." Machine Learning, 2024. doi:10.1007/S10994-023-06440-8

Markdown

[Kuang and Li. "Multi-Label Image Classification with Multi-Layered Multi-Perspective Dynamic Semantic Representation." Machine Learning, 2024.](https://mlanthology.org/mlj/2024/kuang2024mlj-multilabel/) doi:10.1007/S10994-023-06440-8

BibTeX

@article{kuang2024mlj-multilabel,
  title     = {{Multi-Label Image Classification with Multi-Layered Multi-Perspective Dynamic Semantic Representation}},
  author    = {Kuang, Wenlan and Li, Zhixin},
  journal   = {Machine Learning},
  year      = {2024},
  pages     = {3443-3461},
  doi       = {10.1007/S10994-023-06440-8},
  volume    = {113},
  url       = {https://mlanthology.org/mlj/2024/kuang2024mlj-multilabel/}
}