Fast Adaptation and Robust Quantization of Multi-Modal Foundation Models from Associative Memory: A Case Study in SpeechLM
Abstract
We present a preliminary investigation into the outlier problem in multi-modal foundation models, with a focus on SpeechLM. Specifically, we consider SpeechLM models that use a pretrained language model (LM) as the backbone and are fine-tuned on multi-modal (speech and text) data. Outliers arise both in the pretrained LM and in the multi-modal inputs of SpeechLM. By adopting a principled approach inspired by associative memory models to address this outlier problem, we achieve faster low-rank adaptation, more accurate cross-modal fine-tuning, and more robust post-training quantization. Methodologically, we replace the conventional transformer attention mechanism with an outlier-efficient Hopfield layer. This adjustment effectively removes outliers, improving performance both in multi-modal adaptation and in inference with quantized models. As a result, our proposed framework yields an average performance improvement of 7.98% in cross-modal fine-tuning and 67.85% in quantization, significantly outperforming standard frameworks in these respects.
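For intuition, below is a minimal sketch of the kind of attention modification the abstract describes. It assumes, based on related work on outlier-efficient Hopfield layers (Hu et al., 2024), that the layer amounts to scaled dot-product attention normalized with Softmax_1, a softmax variant with an extra zero logit in the denominator that lets a head assign near-zero total weight instead of forcing probability mass onto outlier tokens. The names softmax_1 and outlier_efficient_attention are illustrative, not the paper's API.

import torch

def softmax_1(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    # Softmax with an implicit extra zero logit:
    #   softmax_1(x)_i = exp(x_i) / (1 + sum_j exp(x_j)),
    # so all weights can shrink toward zero together (a "no-op" option).
    m = torch.clamp(x.amax(dim=dim, keepdim=True), min=0.0)  # keep stabilizer >= 0
    e = torch.exp(x - m)
    return e / (torch.exp(-m) + e.sum(dim=dim, keepdim=True))

def outlier_efficient_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Scaled dot-product attention with softmax_1 in place of softmax;
    # a hedged sketch of the outlier-removing attention replacement,
    # with q, k, v shaped (batch, heads, seq_len, head_dim).
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    return softmax_1(scores, dim=-1) @ v

Because softmax_1 approaches the ordinary softmax whenever some logit is large, such a layer behaves like standard attention where attention is warranted, while suppressing the extreme "no-op" activations that make post-training quantization brittle.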
Cite
Text
Wu et al. "Fast Adaptation and Robust Quantization of Multi-Modal Foundation Models from Associative Memory: A Case Study in SpeechLM." ICML 2024 Workshops: ES-FoMo-II, 2024.
Markdown
[Wu et al. "Fast Adaptation and Robust Quantization of Multi-Modal Foundation Models from Associative Memory: A Case Study in SpeechLM." ICML 2024 Workshops: ES-FoMo-II, 2024.](https://mlanthology.org/icmlw/2024/wu2024icmlw-fast/)
BibTeX
@inproceedings{wu2024icmlw-fast,
  title = {{Fast Adaptation and Robust Quantization of Multi-Modal Foundation Models from Associative Memory: A Case Study in SpeechLM}},
  author = {Wu, Shang and Lu, Yen-Ju and Luo, Haozheng and Hu, Jerry Yao-Chieh and Wang, Jiayi and Dehak, Najim and Villalba, Jesus and Liu, Han},
  booktitle = {ICML 2024 Workshops: ES-FoMo-II},
  year = {2024},
  url = {https://mlanthology.org/icmlw/2024/wu2024icmlw-fast/}
}