Multi-VAE: Learning Disentangled View-Common and View-Peculiar Visual Representations for Multi-View Clustering

Abstract

Multi-view clustering, a long-standing and important research problem, focuses on mining complementary information from diverse views. However, existing works often fuse the representations of multiple views or perform clustering in a common feature space, which may entangle those representations, especially in the case of visual representations. To address this issue, we present a novel VAE-based multi-view clustering framework (Multi-VAE) that learns disentangled visual representations. Concretely, we define one view-common variable and multiple view-peculiar variables in the generative model. The prior of the view-common variable follows an approximately discrete Gumbel-Softmax distribution, which is introduced to extract the common cluster factor of multiple views. Meanwhile, the prior of each view-peculiar variable follows a continuous Gaussian distribution, which is used to represent that view's peculiar visual factors. By controlling the mutual information capacity to disentangle the view-common and view-peculiar representations, the continuous visual information of multiple views can be separated so that their common discrete cluster information can be effectively mined. Experimental results demonstrate that Multi-VAE learns disentangled and explainable visual representations while obtaining superior clustering performance compared with state-of-the-art methods.
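
The two kinds of latent variables described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the temperature value, and the toy inputs are hypothetical. The capacity penalty of the form `gamma * |KL - C|` is likewise an assumption borrowed from capacity-controlled VAE objectives, since the abstract does not state the objective itself.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Relaxed one-hot sample for a discrete (view-common) variable."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_sample(mean, log_var, rng):
    """Reparameterized sample for a continuous (view-peculiar) variable."""
    return [mu + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for mu, lv in zip(mean, log_var)]


def capacity_penalty(kl, capacity, gamma):
    """Assumed penalty form: pushes a KL term toward a target capacity."""
    return gamma * abs(kl - capacity)


rng = random.Random(0)
# Hypothetical encoder outputs: 3 cluster logits, 2 Gaussian dimensions.
c = gumbel_softmax_sample([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
z = gaussian_sample([0.0, 1.0], [0.0, -2.0], rng=rng)
```

Lower temperatures push `c` closer to a one-hot vector, which is why the relaxation is called approximately discrete; `z` remains continuous regardless of the temperature.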

Cite

Text

Xu et al. "Multi-VAE: Learning Disentangled View-Common and View-Peculiar Visual Representations for Multi-View Clustering." International Conference on Computer Vision, 2021. doi:10.1109/ICCV48922.2021.00910

Markdown

[Xu et al. "Multi-VAE: Learning Disentangled View-Common and View-Peculiar Visual Representations for Multi-View Clustering." International Conference on Computer Vision, 2021.](https://mlanthology.org/iccv/2021/xu2021iccv-multivae/) doi:10.1109/ICCV48922.2021.00910

BibTeX

@inproceedings{xu2021iccv-multivae,
  title     = {{Multi-VAE: Learning Disentangled View-Common and View-Peculiar Visual Representations for Multi-View Clustering}},
  author    = {Xu, Jie and Ren, Yazhou and Tang, Huayi and Pu, Xiaorong and Zhu, Xiaofeng and Zeng, Ming and He, Lifang},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
  pages     = {9234--9243},
  doi       = {10.1109/ICCV48922.2021.00910},
  url       = {https://mlanthology.org/iccv/2021/xu2021iccv-multivae/}
}