Probabilistic Multimodal Learning with Von Mises-Fisher Distributions

Abstract

Multimodal learning is pivotal for the advancement of artificial intelligence, enabling machines to integrate complementary information from diverse data sources for holistic perception and understanding. Despite significant progress, existing methods struggle with challenges such as noisy inputs, noisy correspondence, and the inherent uncertainty of multimodal data, which limits their reliability and robustness. To address these issues, this paper presents a novel Probabilistic Multimodal Learning framework (PML) that models each data point as a von Mises-Fisher (vMF) distribution, effectively capturing intrinsic uncertainty and enabling robust fusion. Unlike traditional Gaussian-based models, PML learns directional representations with a concentration parameter that directly quantifies reliability, enhancing stability and interpretability. To enhance discrimination, we propose a von Mises-Fisher Prototypical Contrastive Learning paradigm (vMF-PCL), which projects data onto a hypersphere by pulling within-class samples closer to their class prototype while pushing between-class prototypes apart, adaptively learning reliability estimates in the process. Building upon the estimated reliability, we develop a Reliable Multimodal Fusion mechanism (RMF) that dynamically adjusts the contribution and conflict of each modality, ensuring robustness against noisy data, noisy correspondence, and uncertainty. Extensive experiments on nine benchmarks show that PML consistently outperforms 14 state-of-the-art methods. Code is available at https://github.com/XLearning-SCU/2025-IJCAI-PML.
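
To give a feel for how a concentration parameter can act as a reliability weight, the following is a minimal sketch of a generic product-of-vMF fusion rule. It is not the paper's RMF mechanism, whose details the abstract does not give. It relies only on a standard property of the vMF family: multiplying densities proportional to exp(kappa * mu . x) adds their natural parameters kappa * mu. The function names `normalize` and `fuse_vmf` and the example inputs are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (vMF mean directions lie on the unit sphere)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def fuse_vmf(means, kappas):
    """Fuse per-modality vMF estimates by multiplying their densities.

    Assumption: this is a generic illustration, not the RMF mechanism
    from the paper. Each modality contributes kappa * mu, so a modality
    with a larger concentration pulls the fused direction harder.
    """
    dim = len(means[0])
    eta = [0.0] * dim  # summed natural parameter
    for mu, kappa in zip(means, kappas):
        unit_mu = normalize(mu)
        for i in range(dim):
            eta[i] += kappa * unit_mu[i]
    fused_kappa = math.sqrt(sum(x * x for x in eta))
    if fused_kappa == 0.0:
        # Evidence cancels exactly: no preferred direction remains.
        return [0.0] * dim, 0.0
    return [x / fused_kappa for x in eta], fused_kappa


if __name__ == "__main__":
    # Hypothetical modalities: a confident one and an uncertain one.
    agree = fuse_vmf([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]], [50.0, 2.0])
    conflict = fuse_vmf([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]], [50.0, 2.0])
    print("agreeing modalities:", agree)
    print("conflicting modalities:", conflict)
```

Under this rule, two modalities that agree reinforce each other (concentrations 50 and 2 fuse to 52), whereas opposing directions cancel part of the evidence (50 and 2 fuse to 48), so the fused concentration reflects both per-modality reliability and cross-modal conflict.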

Cite

Text

Hu et al. "Probabilistic Multimodal Learning with Von Mises-Fisher Distributions." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/600

Markdown

[Hu et al. "Probabilistic Multimodal Learning with Von Mises-Fisher Distributions." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/hu2025ijcai-probabilistic/) doi:10.24963/IJCAI.2025/600

BibTeX

@inproceedings{hu2025ijcai-probabilistic,
  title     = {{Probabilistic Multimodal Learning with Von Mises-Fisher Distributions}},
  author    = {Hu, Peng and Qin, Yang and Gou, Yuanbiao and Li, Yunfan and Yang, Mouxing and Peng, Xi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {5390--5398},
  doi       = {10.24963/IJCAI.2025/600},
  url       = {https://mlanthology.org/ijcai/2025/hu2025ijcai-probabilistic/}
}