MoSA: Mosaic Shared Adaptation of Large Language Models

Abstract

We introduce MoSA, a parameter-efficient fine-tuning (PEFT) method that replaces low-rank factorization with randomized, fine-grained sharing of weight updates. Each weight update is constructed by broadcasting a small set of learned scalars over a fixed tessellation (a predefined assignment of the weight matrix's entries to groups), which yields expressive changes under the same parameter budget as low-rank adaptation (LoRA). MoSA requires no architectural changes, and its updates can be merged into the base model for zero-overhead inference. Across diverse language understanding and generation tasks, MoSA matches or surpasses strong PEFT baselines under strictly matched budgets. Analyses and ablations indicate that non-local parameter sharing acts as an effective regularizer, and that grouping design and budget allocation govern the expressivity–efficiency trade-off. These results position MoSA as a simple, scalable alternative to LoRA. Our code is available at https://github.com/XiequnWang/MoSA-ICLR26.
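
To illustrate how learned scalars are broadcast over a fixed tessellation, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a uniformly random group assignment and a scalar count set equal to LoRA's r(d_out + d_in) parameters for one matrix; the abstract does not specify either choice. The function names `make_tessellation`, `shared_update`, `lora_budget`, and `scalar_gradient` are hypothetical.

```python
import numpy as np

def make_tessellation(d_out, d_in, num_groups, seed=0):
    # Assumption: each weight entry is assigned to a group uniformly at random.
    # The assignment is drawn once and then kept fixed (it is not trained).
    rng = np.random.default_rng(seed)
    return rng.integers(0, num_groups, size=(d_out, d_in))

def shared_update(scalars, groups):
    # Broadcast: delta[i, j] = scalars[groups[i, j]].
    return scalars[groups]

def lora_budget(d_out, d_in, rank):
    # Trainable parameters of a rank-r LoRA update on a d_out x d_in matrix.
    return rank * (d_out + d_in)

def scalar_gradient(grad_delta, groups, num_groups):
    # Chain rule for shared entries: a scalar's gradient is the sum of the
    # gradients of all weight entries assigned to its group.
    return np.bincount(groups.ravel(), weights=grad_delta.ravel(),
                       minlength=num_groups)

d_out, d_in, rank = 8, 6, 2
num_groups = lora_budget(d_out, d_in, rank)   # matched budget: 2 * (8 + 6)
groups = make_tessellation(d_out, d_in, num_groups)

rng = np.random.default_rng(1)
W = rng.standard_normal((d_out, d_in))             # stand-in for a frozen base weight
scalars = 0.01 * rng.standard_normal(num_groups)   # stand-in for learned scalars

# Merging: the update has the same shape as W, so it can be added in place
# and inference then uses a single dense matrix.
W_merged = W + shared_update(scalars, groups)
```

Because the update is a dense matrix of the same shape as the base weight, merging reduces to one addition. The update is also not constrained to rank r, which is one way to read the abstract's claim of expressive changes under a matched budget.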

Cite

Text

Wang et al. "MoSA: Mosaic Shared Adaptation of Large Language Models." International Conference on Learning Representations, 2026.

Markdown

[Wang et al. "MoSA: Mosaic Shared Adaptation of Large Language Models." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/wang2026iclr-mosa/)

BibTeX

@inproceedings{wang2026iclr-mosa,
  title     = {{MoSA: Mosaic Shared Adaptation of Large Language Models}},
  author    = {Wang, Xiequn and Zhuang, Zhan and Luo, Shengda and Zhang, Yu},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/wang2026iclr-mosa/}
}