Scene-Aware Label Graph Learning for Multi-Label Image Classification

Abstract

Multi-label image classification is the task of assigning a set of labels to an image. One of its main challenges is capturing the correlations among labels effectively. Existing studies mostly rely on statistical label co-occurrence or on the semantic similarity of labels. However, they overlook an important fact: label co-occurrence is closely related to the image scene (indoor, outdoor, etc.), a vital characteristic in multi-label image classification. In this paper, we propose a novel scene-aware label graph learning framework that learns visual representations for labels while perceiving their co-occurrence relationships under different scenes. Specifically, our framework detects the scene categories of images without relying on manual annotations, and keeps track of co-occurring labels by maintaining a global co-occurrence matrix for each scene category throughout the training phase. These scene-specific co-occurrence matrices are then used to guide the interactions among label representations through graph propagation, towards accurate label prediction. Extensive experiments on public benchmarks demonstrate the superiority of our framework over state-of-the-art methods. Code will be publicly available soon.
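
The bookkeeping described in the abstract (one co-occurrence matrix per scene category, then used to guide interactions among label representations) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `make_matrices`, `update`, `row_normalize`, and `propagate` are hypothetical, the scene index is assumed to be given (the paper's annotation-free scene detection is not modeled), and the residual-plus-weighted-average propagation rule is an assumption chosen for simplicity.

```python
# Hypothetical sketch: names and update rules are illustrative assumptions,
# not the authors' implementation.

def make_matrices(num_scenes, num_labels):
    """One label-by-label co-occurrence count matrix per scene category."""
    return [[[0.0] * num_labels for _ in range(num_labels)]
            for _ in range(num_scenes)]

def update(matrices, scene, labels):
    """Count every ordered pair of distinct labels present in one image."""
    present = [i for i, y in enumerate(labels) if y == 1]
    for i in present:
        for j in present:
            if i != j:
                matrices[scene][i][j] += 1.0

def row_normalize(matrix):
    """Turn counts into propagation weights; all-zero rows stay zero."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out

def propagate(features, adjacency):
    """One step: each label keeps its own feature vector and adds the
    weighted average of its co-occurring neighbours' feature vectors."""
    n, dim = len(features), len(features[0])
    return [
        [features[i][d] + sum(adjacency[i][j] * features[j][d] for j in range(n))
         for d in range(dim)]
        for i in range(n)
    ]

# Toy usage: 2 scenes, 3 labels, binary label vectors per training image.
mats = make_matrices(num_scenes=2, num_labels=3)
update(mats, scene=0, labels=[1, 1, 0])
update(mats, scene=0, labels=[1, 0, 1])
update(mats, scene=1, labels=[0, 1, 1])

adj = row_normalize(mats[0])                      # weights for scene 0 only
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]      # one 2-d feature per label
new_feats = propagate(feats, adj)
```

Keeping the matrices separate per scene means a label pair that co-occurs often in one scene does not influence propagation for images assigned to another scene, which is the behaviour the abstract motivates.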

Cite

Text

Zhu et al. "Scene-Aware Label Graph Learning for Multi-Label Image Classification." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00142

Markdown

[Zhu et al. "Scene-Aware Label Graph Learning for Multi-Label Image Classification." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/zhu2023iccv-sceneaware/) doi:10.1109/ICCV51070.2023.00142

BibTeX

@inproceedings{zhu2023iccv-sceneaware,
  title     = {{Scene-Aware Label Graph Learning for Multi-Label Image Classification}},
  author    = {Zhu, Xuelin and Liu, Jian and Liu, Weijia and Ge, Jiawei and Liu, Bo and Cao, Jiuxin},
  booktitle = {International Conference on Computer Vision},
  year      = {2023},
  pages     = {1473-1482},
  doi       = {10.1109/ICCV51070.2023.00142},
  url       = {https://mlanthology.org/iccv/2023/zhu2023iccv-sceneaware/}
}