Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs
Abstract
This paper addresses the high computational complexity of self-attention in Large Language Models (LLMs), a significant challenge in both natural language processing (NLP) and multi-modal tasks. We propose Low-Rank Approximation for Sparse Attention (LoRA-Sparse), an innovative approach that strategically reduces this complexity. LoRA-Sparse introduces low-rank linear projection layers for sparse attention approximation and uses an order-mimic training methodology, which is crucial for efficiently approximating the self-attention mechanism in LLMs. We empirically show that sparse attention not only reduces computational demands but also enhances model performance in both NLP and multi-modal tasks, which surprisingly suggests that much of the attention in LLMs is redundant. We extensively validate LoRA-Sparse through rigorous empirical studies on both NLP and multi-modal tasks, demonstrating its effectiveness and general applicability. Based on the LLaMA and LLaVA models, our method can eliminate more than half of the self-attention computation while performing even better than full-attention baselines.
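To make the core idea concrete, here is a minimal sketch of low-rank sparse attention as the abstract describes it: cheap rank-r projections of the queries and keys estimate the attention scores, and exact attention is then computed only over the top-k keys per query. This is an illustrative assumption-based sketch, not the authors' implementation; all names (`low_rank_sparse_attention`, `Wq`, `Wk`, `top_k`) are hypothetical, and the projections here are random, whereas the paper trains them with an order-mimic objective to match the ranking of the full attention scores.

```python
import numpy as np

def low_rank_sparse_attention(Q, K, V, Wq, Wk, top_k):
    """Hypothetical sketch: rank-r score estimate + exact attention on top_k keys."""
    n, d = Q.shape
    # Cheap score estimate via rank-r projections: O(n^2 r) instead of O(n^2 d).
    approx_scores = (Q @ Wq) @ (K @ Wk).T            # (n, n)
    # For each query, keep only the top_k keys by approximate score.
    keep = np.argsort(-approx_scores, axis=1)[:, :top_k]
    out = np.zeros((n, V.shape[1]))
    for i in range(n):
        k_sel, v_sel = K[keep[i]], V[keep[i]]
        s = (Q[i] @ k_sel.T) / np.sqrt(d)            # exact scores on the sparse set
        w = np.exp(s - s.max())
        w /= w.sum()                                 # softmax over kept keys only
        out[i] = w @ v_sel
    return out

rng = np.random.default_rng(0)
n, d, r = 16, 32, 4                                  # r << d gives the savings
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
Wq, Wk = rng.standard_normal((d, r)), rng.standard_normal((d, r))
out = low_rank_sparse_attention(Q, K, V, Wq, Wk, top_k=8)
print(out.shape)  # (16, 32)
```

With `top_k = n // 2`, the exact score computation touches half the keys, matching the abstract's claim of removing more than half of the self-attention computation (the rank-r estimate adds only an O(n²r) overhead).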
Cite
Text
Song et al. "Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.01306
Markdown
[Song et al. "Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/song2024cvpr-lowrank/) doi:10.1109/CVPR52733.2024.01306
BibTeX
@inproceedings{song2024cvpr-lowrank,
title = {{Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs}},
author = {Song, Lin and Chen, Yukang and Yang, Shuai and Ding, Xiaohan and Ge, Yixiao and Chen, Ying-Cong and Shan, Ying},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2024},
pages = {13763-13773},
doi = {10.1109/CVPR52733.2024.01306},
url = {https://mlanthology.org/cvpr/2024/song2024cvpr-lowrank/}
}