MoSA: Mosaic Shared Adaptation of Large Language Models
Abstract
We introduce MoSA, a new parameter-efficient fine-tuning (PEFT) method that replaces low-rank factorization with randomized, fine-grained sharing of weight updates. Each adapted weight matrix is constructed by broadcasting a small set of learned scalars over a fixed tessellation, that is, a predefined assignment of the matrix's entries to groups. This produces expressive changes under the same parameter budget as low-rank adaptation (LoRA). MoSA requires no architectural changes and can be merged into the base model for zero-overhead inference. Across diverse language understanding and generation tasks, MoSA matches or surpasses strong PEFT baselines under strictly matched budgets. Analyses and ablations indicate that non-local parameter sharing acts as an effective regularizer, and that grouping design and budget allocation govern the expressivity–efficiency trade-off. These results position MoSA as a simple, scalable alternative to LoRA. Our code is available at https://github.com/XiequnWang/MoSA-ICLR26.
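The broadcasting idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`make_tessellation`, `mosaic_delta`, `merge`, `scalar_grad`), the uniform random group assignment, the additive form of the update, the zero initialization, and the choice of `num_groups` to match a rank-`r` LoRA budget of `r * (d_out + d_in)` are all assumptions made for illustration.

```python
import numpy as np

def make_tessellation(shape, num_groups, seed=0):
    """Fixed group assignment: each weight entry gets a group id.
    Uniform random assignment is an assumption of this sketch."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, num_groups, size=shape)

def mosaic_delta(scalars, groups):
    """Broadcast one learned scalar per group over the whole matrix."""
    return scalars[groups]

def merge(base_weight, scalars, groups):
    """Fold the update into the base weight (assumed additive),
    so inference uses a single dense matrix."""
    return base_weight + mosaic_delta(scalars, groups)

def scalar_grad(weight_grad, groups, num_groups):
    """Chain rule: the gradient of a shared scalar is the sum of the
    weight gradients over all entries assigned to its group."""
    return np.bincount(groups.ravel(), weights=weight_grad.ravel(),
                       minlength=num_groups)

# Hypothetical layer size and LoRA rank used only to set the budget.
d_out, d_in, rank = 64, 32, 4
num_groups = rank * (d_out + d_in)  # same trainable-parameter count as rank-4 LoRA

groups = make_tessellation((d_out, d_in), num_groups)
scalars = np.zeros(num_groups)      # zero init (assumed): adapted model starts at the base model
W = np.random.default_rng(1).standard_normal((d_out, d_in))
W_adapted = merge(W, scalars, groups)
```

In this sketch only `scalars` would be trained, while `groups` stays fixed and `W` stays frozen. After training, `merge` yields a single matrix of the original shape, which is consistent with the zero-overhead inference the abstract describes.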
Cite
Text
Wang et al. "MoSA: Mosaic Shared Adaptation of Large Language Models." International Conference on Learning Representations, 2026.
Markdown
[Wang et al. "MoSA: Mosaic Shared Adaptation of Large Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/wang2026iclr-mosa/)
BibTeX
@inproceedings{wang2026iclr-mosa,
title = {{MoSA: Mosaic Shared Adaptation of Large Language Models}},
author = {Wang, Xiequn and Zhuang, Zhan and Luo, Shengda and Zhang, Yu},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/wang2026iclr-mosa/}
}