Fast Adaptation and Robust Quantization of Multi-Modal Foundation Models from Associative Memory: A Case Study in SpeechLM

Abstract

We present a preliminary investigation into the outlier problem in multi-modal foundation models, with a focus on SpeechLM. Specifically, we consider SpeechLM models that employ a pretrained LM as the backbone and are fine-tuned on multi-modal data (speech and text). Outliers arise both in pretrained LLMs and in the multi-modal inputs to SpeechLM. By adopting a principled approach inspired by associative memory models to address the outlier problem, we achieve significant improvements in three respects: faster low-rank adaptation, more accurate cross-modal fine-tuning, and more robust post-training quantization. Methodologically, we implement an outlier-efficient Hopfield layer that replaces the conventional transformer attention mechanism. This adjustment effectively removes outliers, improving performance both in multi-modal adaptation and in inference with quantized models. As a result, our proposed framework yields an average performance improvement of 7.98% in cross-modal fine-tuning and 67.85% in quantization, significantly outperforming standard frameworks in these respects.
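
To make the mechanism concrete, below is a minimal sketch (ours, not the authors' released code) of one common way an outlier-efficient attention/Hopfield update is realized: replacing the standard softmax with a softmax_1 variant whose denominator carries an extra constant term, so a head can assign near-zero weight to every token instead of concentrating probability mass on outlier tokens. All function and variable names here are illustrative assumptions.

import torch

def softmax_1(logits: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """softmax_1(x)_i = exp(x_i) / (1 + sum_j exp(x_j)).

    Equivalent to standard softmax with an extra implicit zero logit,
    which lets the distribution sum to less than one.
    """
    # Shift by the max for numerical stability; the implicit zero
    # logit becomes exp(-m) after the shift.
    m = logits.max(dim=dim, keepdim=True).values
    shifted = torch.exp(logits - m)
    return shifted / (torch.exp(-m) + shifted.sum(dim=dim, keepdim=True))

def outlier_efficient_attention(q: torch.Tensor,
                                k: torch.Tensor,
                                v: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention with softmax_1 replacing softmax."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return softmax_1(scores, dim=-1) @ v

In a quantization-friendly transformer block, this drop-in change reduces the extreme activation values that otherwise dominate the dynamic range of post-training quantizers.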

Cite

Text

Wu et al. "Fast Adaptation and Robust Quantization of Multi-Modal Foundation Models from Associative Memory: A Case Study in SpeechLM." ICML 2024 Workshops: ES-FoMo-II, 2024.

Markdown

[Wu et al. "Fast Adaptation and Robust Quantization of Multi-Modal Foundation Models from Associative Memory: A Case Study in SpeechLM." ICML 2024 Workshops: ES-FoMo-II, 2024.](https://mlanthology.org/icmlw/2024/wu2024icmlw-fast/)

BibTeX

@inproceedings{wu2024icmlw-fast,
  title     = {{Fast Adaptation and Robust Quantization of Multi-Modal Foundation Models from Associative Memory: A Case Study in SpeechLM}},
  author    = {Wu, Shang and Lu, Yen-Ju and Luo, Haozheng and Hu, Jerry Yao-Chieh and Wang, Jiayi and Dehak, Najim and Villalba, Jesus and Liu, Han},
  booktitle = {ICML 2024 Workshops: ES-FoMo-II},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/wu2024icmlw-fast/}
}