Learning Attentive and Hierarchical Representations for 3D Shape Recognition

Abstract

This paper proposes a novel method for 3D shape representation learning, namely Hyperbolic Embedded Attentive Representation (HEAR). Unlike existing multi-view based methods, HEAR develops a unified framework to address both multi-view redundancy and single-view incompleteness. Specifically, HEAR first employs a hybrid attention (HA) module, which consists of a view-agnostic attention (VAA) block and a view-specific attention (VSA) block. These two blocks jointly explore distinct but complementary spatial saliency of local features for each single-view image. Subsequently, a multi-granular view pooling (MVP) module is introduced to aggregate the multi-view features at different granularities in a coarse-to-fine manner. The resulting feature set implicitly encodes hierarchical relations and is therefore projected into a hyperbolic space via a hyperbolic embedding. A hierarchical representation is then learned by hyperbolic multi-class logistic regression based on hyperbolic geometry. Experimental results show that HEAR outperforms state-of-the-art approaches on three 3D shape recognition tasks: generic 3D shape retrieval, 3D shape classification, and sketch-based 3D shape retrieval.
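
To make the hyperbolic projection step more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the Poincaré ball model (the abstract does not say which model of hyperbolic space is used), and the function names `expmap0` and `poincare_distance` and the curvature parameter `c` are hypothetical choices for illustration. It shows the standard exponential map at the origin, which sends a Euclidean feature vector into the open unit ball, and the geodesic distance between two points in that ball.

```python
import math


def expmap0(v, c=1.0):
    """Exponential map at the origin of a Poincare ball with curvature -c.

    Assumption: the Poincare ball model is used here for illustration only.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)  # the origin maps to itself
    sqrt_c = math.sqrt(c)
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [scale * x for x in v]


def poincare_distance(x, y):
    """Geodesic distance between two points of the unit Poincare ball (c = 1)."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_x) * (1.0 - sq_y)))


# Hypothetical 2-D feature, chosen only to keep the arithmetic easy to follow.
feature = [0.3, 0.4]
point = expmap0(feature)
origin = [0.0, 0.0]
print(point, poincare_distance(origin, point))
```

With `c = 1`, the distance from the origin to `expmap0(v)` equals twice the Euclidean norm of `v`. Distances grow without bound toward the boundary of the ball, the property usually cited as making hyperbolic space a natural fit for tree-like, hierarchical data. Inputs with very large norms saturate at the boundary in floating point, so practical implementations typically clip the result slightly inside the ball.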

Cite

Text

Chen et al. "Learning Attentive and Hierarchical Representations for 3D Shape Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58555-6_7

Markdown

[Chen et al. "Learning Attentive and Hierarchical Representations for 3D Shape Recognition." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/chen2020eccv-learning-b/) doi:10.1007/978-3-030-58555-6_7

BibTeX

@inproceedings{chen2020eccv-learning-b,
  title     = {{Learning Attentive and Hierarchical Representations for 3D Shape Recognition}},
  author    = {Chen, Jiaxin and Qin, Jie and Shen, Yuming and Liu, Li and Zhu, Fan and Shao, Ling},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58555-6_7},
  url       = {https://mlanthology.org/eccv/2020/chen2020eccv-learning-b/}
}