Supervised Coupled Dictionary Learning with Group Structures for Multi-Modal Retrieval

Abstract

A good similarity mapping function across heterogeneous, high-dimensional features is highly desirable for many applications involving multi-modal data. In this paper, we introduce coupled dictionary learning (DL) into supervised sparse coding for multi-modal (cross-media) retrieval, an approach we call Supervised coupled dictionary learning with group structures for Multi-Modal retrieval (SliM2). SliM2 formulates the multi-modal mapping as a constrained dictionary learning problem. By exploiting the intrinsic ability of DL to handle heterogeneous features, SliM2 extends unimodal DL to multi-modal DL. Moreover, SliM2 employs label information to discover the structure shared within each modality among samples of the same class, via a mixed norm (i.e., the `l1/l2`-norm). As a result, multi-modal retrieval is conducted via a set of jointly learned mapping functions across the modalities. Experimental results demonstrate the effectiveness of the proposed model when applied to cross-media retrieval.
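
To make the pipeline concrete, here is a minimal NumPy sketch, not the authors' implementation: it couples two modality-specific dictionaries through a shared code matrix, learned class by class so that an `l1/l2` group penalty encourages same-class samples to select the same dictionary atoms. All names and hyperparameters (`coupled_dl`, `lam`, `k`, `n_outer`) are illustrative assumptions, and the solver (ISTA for the codes plus alternating least-squares dictionary updates) is a generic substitute for the paper's optimization.

```python
import numpy as np

rng = np.random.default_rng(0)

def prox_l1l2(A, lam):
    """Row-wise l1/l2 (group-lasso) shrinkage: each dictionary atom's
    coefficients across the samples of one class are kept or zeroed
    together, so same-class samples tend to share atoms."""
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * A

def sparse_code(X, D, lam, n_iter=100):
    """Group-sparse coding of one class's samples X (d x n) against a
    dictionary D (d x k) by proximal gradient (ISTA) on the l1/l2 penalty."""
    A = np.zeros((D.shape[1], X.shape[1]))
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = D.T @ (D @ A - X)
        A = prox_l1l2(A - grad / L, lam / L)
    return A

def coupled_dl(X, Y, labels, k=64, lam=0.1, n_outer=20):
    """Alternate between (i) jointly coding paired samples so both
    modalities share one code matrix A, and (ii) dictionary updates.
    X: d1 x n features of modality one (e.g., images),
    Y: d2 x n features of modality two (e.g., text), labels: n class ids."""
    labels = np.asarray(labels)
    d1, n = X.shape
    Dx = rng.standard_normal((d1, k))
    Dy = rng.standard_normal((Y.shape[0], k))
    A = np.zeros((k, n))
    for _ in range(n_outer):
        # (i) shared codes: stack both modalities so one A reconstructs
        # both, applying the group penalty one class at a time (the
        # label-driven group structure)
        Z = np.vstack([X, Y])
        D = np.vstack([Dx, Dy])
        for c in np.unique(labels):
            idx = labels == c
            A[:, idx] = sparse_code(Z[:, idx], D, lam)
        # (ii) ridge-regularized least-squares dictionary updates,
        # followed by column normalization
        G = A @ A.T + 1e-6 * np.eye(k)
        Dx = np.linalg.solve(G, A @ X.T).T
        Dy = np.linalg.solve(G, A @ Y.T).T
        Dx /= np.maximum(np.linalg.norm(Dx, axis=0), 1e-12)
        Dy /= np.maximum(np.linalg.norm(Dy, axis=0), 1e-12)
    return Dx, Dy
```

Under these assumptions, retrieval would code a query from one modality against its own dictionary (e.g., `sparse_code` of a text feature against `Dy`) and rank items of the other modality by comparing codes, e.g., with cosine similarity.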

Cite

Text

Zhuang et al. "Supervised Coupled Dictionary Learning with Group Structures for Multi-Modal Retrieval." AAAI Conference on Artificial Intelligence, 2013. doi:10.1609/aaai.v27i1.8603

Markdown

[Zhuang et al. "Supervised Coupled Dictionary Learning with Group Structures for Multi-Modal Retrieval." AAAI Conference on Artificial Intelligence, 2013.](https://mlanthology.org/aaai/2013/zhuang2013aaai-supervised/) doi:10.1609/aaai.v27i1.8603

BibTeX

@inproceedings{zhuang2013aaai-supervised,
  title     = {{Supervised Coupled Dictionary Learning with Group Structures for Multi-Modal Retrieval}},
  author    = {Zhuang, Yueting and Wang, Yanfei and Wu, Fei and Zhang, Yin and Lu, Weiming},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2013},
  pages     = {1070--1076},
  doi       = {10.1609/aaai.v27i1.8603},
  url       = {https://mlanthology.org/aaai/2013/zhuang2013aaai-supervised/}
}