Composing Linear Layers from Irreducibles
Abstract
Contemporary large models often exhibit behaviors suggesting the presence of low-level primitives that compose into modules with richer functionality, but these fundamental building blocks remain poorly understood. We investigate this compositional structure in linear layers by asking: \textit{can we identify/synthesize linear transformations from a minimal set of geometric primitives?} Using Clifford algebra, we show that linear layers can be expressed as compositions of bivectors---geometric objects encoding oriented planes---and introduce a differentiable algorithm that decomposes them into products of rotors. This construction uses only $\mathcal{O}(\log^2 d)$ parameters, versus $\mathcal{O}(d^2)$ required by dense matrices. Applied to the key, query, and value projections in LLM attention layers, our rotor-based layers match the performance of strong baselines such as block-Hadamard and low-rank approximations. Our findings provide an algebraic perspective on how these geometric primitives can compose into higher-level functions within deep models.
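The sketch below is a rough illustration of the idea the abstract describes: a linear map built by composing rotors generated by simple bivectors (oriented coordinate planes), each of which acts on vectors as a plane rotation via the sandwich product $x \mapsto R x \tilde{R}$. The function names and the choice of planes are hypothetical, and this is not the paper's parameterization; in particular, the differentiable decomposition achieving $\mathcal{O}(\log^2 d)$ parameters uses general bivectors and is not specified in the abstract.

```python
import numpy as np

def plane_rotor_matrix(d, i, j, theta):
    """Matrix action of the rotor R = exp(-theta/2 * e_i e_j) on vectors,
    x -> R x R~: a rotation by theta in the oriented (e_i, e_j) plane."""
    M = np.eye(d)
    M[i, i] = np.cos(theta)
    M[j, j] = np.cos(theta)
    M[i, j] = -np.sin(theta)
    M[j, i] = np.sin(theta)
    return M

def rotor_layer(x, planes, thetas):
    """Apply a composition of plane rotations to a batch of vectors.
    Illustrative only: simple bivectors give Givens-style rotations,
    whereas the paper composes rotors from general bivectors."""
    d = x.shape[-1]
    for (i, j), theta in zip(planes, thetas):
        x = x @ plane_rotor_matrix(d, i, j, theta).T
    return x

# Example: a 4-dimensional "rotor layer" built from two oriented planes.
x = np.random.randn(2, 4)
y = rotor_layer(x, planes=[(0, 1), (2, 3)], thetas=[0.3, -1.2])
```

Since each rotor induces an orthogonal map, any composition of them is itself orthogonal and is specified by far fewer free parameters than a dense $d \times d$ projection, which is the intuition behind replacing the key, query, and value matrices with rotor products.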
Cite
Text
Pence et al. "Composing Linear Layers from Irreducibles." Advances in Neural Information Processing Systems, 2025.
Markdown
[Pence et al. "Composing Linear Layers from Irreducibles." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/pence2025neurips-composing/)
BibTeX
@inproceedings{pence2025neurips-composing,
title = {{Composing Linear Layers from Irreducibles}},
author = {Pence, Travis and Yamada, Daisuke and Singh, Vikas},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/pence2025neurips-composing/}
}