Learning Robust Multi-View Representation Using Dual-Masked VAEs

Abstract

Most existing multi-view representation learning methods assume view-completeness and noise-free data. However, such assumptions are often too strong for real-world applications. Despite advances in methods tailored to the missing-view or noisy-data problem individually, a one-size-fits-all approach that addresses both concurrently remains unavailable. To this end, we propose a holistic method, called Dual-masked Variational Autoencoders (DualVAE), which aims at learning robust multi-view representations. DualVAE combines dual-masked prediction, mixture-of-experts learning, and representation disentanglement, with a joint loss function that unifies all components. The key novelty lies in the dual-masked (view-mask and patch-mask) mechanism, which mimics missing views and noisy data. Extensive experiments on four multi-view datasets demonstrate the effectiveness of the proposed method and its superior performance over baselines. The code is available at https://github.com/XLearning-SCU/2025-IJCAI-DualVAE.
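
As a rough sketch of the dual-masked mechanism described in the abstract, the snippet below applies a view-level mask (dropping whole views to mimic missing views) and a patch-level mask (zeroing random patches within each view to mimic noisy data). The function name, mask ratios, and tensor layout are illustrative assumptions and not the authors' implementation; see the linked repository for the actual code.

import torch

def dual_mask(views, view_mask_ratio=0.3, patch_mask_ratio=0.5):
    """Apply view-level and patch-level masking to a list of view tensors.

    views: list of tensors of shape (batch, num_patches, dim), where each
    "patch" is a token-like chunk of that view's features.
    Returns the masked views together with the masks that were applied.
    """
    batch, num_views = views[0].shape[0], len(views)

    # View-mask: per sample, randomly mark some views as missing,
    # but always keep at least one view observed.
    view_mask = torch.rand(batch, num_views) < view_mask_ratio
    all_masked = view_mask.all(dim=1)
    keep = torch.randint(num_views, (int(all_masked.sum()),))
    view_mask[all_masked, keep] = False

    masked_views, patch_masks = [], []
    for v, x in enumerate(views):
        # Patch-mask: zero out a random subset of patches inside each view.
        patch_mask = torch.rand(batch, x.shape[1]) < patch_mask_ratio
        x = x.masked_fill(patch_mask.unsqueeze(-1), 0.0)
        # Apply the view-mask on top: a missing view is zeroed out entirely.
        x = x.masked_fill(view_mask[:, v].view(-1, 1, 1), 0.0)
        masked_views.append(x)
        patch_masks.append(patch_mask)
    return masked_views, view_mask, patch_masks

if __name__ == "__main__":
    # Two toy views, each with 16 patches of 8-dimensional features.
    toy_views = [torch.randn(4, 16, 8), torch.randn(4, 16, 8)]
    masked, vmask, pmasks = dual_mask(toy_views)
    print(vmask)            # which views are treated as missing per sample
    print(masked[0].shape)  # torch.Size([4, 16, 8]), with masked entries zeroed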

Cite

Text

Wang et al. "Learning Robust Multi-View Representation Using Dual-Masked VAEs." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/701

Markdown

[Wang et al. "Learning Robust Multi-View Representation Using Dual-Masked VAEs." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/wang2025ijcai-learning/) doi:10.24963/IJCAI.2025/701

BibTeX

@inproceedings{wang2025ijcai-learning,
  title     = {{Learning Robust Multi-View Representation Using Dual-Masked VAEs}},
  author    = {Wang, Jiedong and Guo, Kai and Hu, Peng and Peng, Xi and Wang, Hao},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {6298--6306},
  doi       = {10.24963/IJCAI.2025/701},
  url       = {https://mlanthology.org/ijcai/2025/wang2025ijcai-learning/}
}