Interactive Multimodal Learning via Flat Gradient Modification
Abstract
Due to the notorious modality imbalance phenomenon, multimodal learning (MML) struggles to achieve satisfactory performance. Recently, multimodal learning with alternating unimodal adaptation (MLA) has proven effective in mitigating the interference between modalities by capturing interaction through orthogonal projection, thus relieving the modality imbalance phenomenon to some extent. However, a projection strategy that is orthogonal to the original space can lead to poor plasticity as the alternating learning proceeds, thus degrading model performance. To address this issue, we propose a novel multimodal learning method called interactive MML via flat gradient modification (IGM), which employs a flat gradient modification strategy to enhance interactive MML. Specifically, we first employ a flat projection-based gradient modification strategy that is independent of the original space, aiming to avoid the poor plasticity issue. We then introduce a sharpness-aware minimization (SAM)-based optimization strategy to fully exploit the flatness of the learning objective and further enhance interaction during learning. In this way, the plasticity problem can be avoided and the overall performance is improved. Extensive experiments on widely used datasets demonstrate that IGM outperforms various state-of-the-art (SOTA) baselines. The source code is available at https://anonymous.4open.science/r/method-CC45.
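For readers unfamiliar with sharpness-aware minimization, the sketch below illustrates the generic two-step SAM update that the abstract refers to: perturb the weights toward a nearby worst case, then descend using the gradient taken at that perturbed point. It is not the IGM implementation and omits the projection-based gradient modification entirely. The toy quadratic objective, the function names `loss`, `grad`, and `sam_step`, and the values of `lr` and `rho` are all assumptions chosen for illustration.

```python
import math

def loss(w):
    # Hypothetical toy objective standing in for a training loss.
    return 0.5 * (3.0 * w[0] ** 2 + 0.5 * w[1] ** 2)

def grad(w):
    # Analytic gradient of the toy objective above.
    return [3.0 * w[0], 0.5 * w[1]]

def sam_step(w, lr=0.1, rho=0.05):
    """One generic SAM update (illustrative values for lr and rho)."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12
    # Ascent step: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient from the perturbed point to the original weights.
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

w_init = [1.0, -2.0]
w = list(w_init)
for _ in range(50):
    w = sam_step(w)

print(loss(w_init), loss(w))
```

Because the descent direction is evaluated at the perturbed weights, the update favors regions where the loss stays low in a neighborhood of the current point, which is the sense of "flatness" that SAM-style methods target.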
Cite
Text
Jiang et al. "Interactive Multimodal Learning via Flat Gradient Modification." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/611
Markdown
[Jiang et al. "Interactive Multimodal Learning via Flat Gradient Modification." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/jiang2025ijcai-interactive/) doi:10.24963/IJCAI.2025/611
BibTeX
@inproceedings{jiang2025ijcai-interactive,
title = {{Interactive Multimodal Learning via Flat Gradient Modification}},
author = {Jiang, Qing-Yuan and Chi, Zhouyang and Yang, Yang},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2025},
pages = {5489-5497},
doi = {10.24963/IJCAI.2025/611},
url = {https://mlanthology.org/ijcai/2025/jiang2025ijcai-interactive/}
}