Bidirectional Contrastive Split Learning for Visual Question Answering

Abstract

Visual Question Answering (VQA) based on multi-modal data facilitates real-life applications such as home robots and medical diagnosis. One significant challenge is to devise a robust decentralized learning framework for various client models where centralized data collection is avoided due to confidentiality concerns. This work aims to tackle privacy-preserving VQA by decoupling a multi-modal model into representation modules and a contrastive module, leveraging inter-module gradient sharing and inter-client weight sharing. To this end, we propose Bidirectional Contrastive Split Learning (BiCSL) to train a global multi-modal model on the entire data distribution of decentralized clients. We employ a contrastive loss that enables more efficient self-supervised learning of decentralized modules. Comprehensive experiments on the VQA-v2 dataset with five SOTA VQA models demonstrate the effectiveness of the proposed method. Furthermore, we inspect BiCSL's robustness against a dual-key backdoor attack on VQA. BiCSL shows significantly enhanced resilience to this multi-modal adversarial attack compared to the centralized learning method, making it a promising approach to decentralized multi-modal learning.
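
The abstract does not give the form of the contrastive loss, so the following is only a minimal sketch of a generic InfoNCE-style loss of the kind commonly used for self-supervised training, not the paper's formulation. It assumes each row of `queries` is an embedding produced by one module and the same-index row of `keys` is its matching embedding from another module; the function names and the `temperature` value are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(queries, keys, temperature=0.1):
    """Generic InfoNCE-style loss (illustrative assumption, not from the paper).

    For each query i, the key at index i is the positive and all other
    keys in the batch are negatives. Returns the mean cross-entropy of
    picking the positive under temperature-scaled cosine similarity.
    """
    q = [l2_normalize(v) for v in queries]
    k = [l2_normalize(v) for v in keys]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp over all keys in the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(q)


# Toy batch: two orthogonal embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0]]
matched_loss = info_nce(embeddings, embeddings)
mismatched_loss = info_nce(embeddings, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch, correctly paired embeddings give a loss close to zero, while swapping the keys gives a much larger loss, which is the signal a contrastive module would use to pull matching representations together.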

Cite

Text

Sun and Ochiai. "Bidirectional Contrastive Split Learning for Visual Question Answering." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I19.30158

Markdown

[Sun and Ochiai. "Bidirectional Contrastive Split Learning for Visual Question Answering." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/sun2024aaai-bidirectional/) doi:10.1609/AAAI.V38I19.30158

BibTeX

@inproceedings{sun2024aaai-bidirectional,
  title     = {{Bidirectional Contrastive Split Learning for Visual Question Answering}},
  author    = {Sun, Yuwei and Ochiai, Hideya},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {21602-21609},
  doi       = {10.1609/AAAI.V38I19.30158},
  url       = {https://mlanthology.org/aaai/2024/sun2024aaai-bidirectional/}
}