$\text{Transformer}^2$: Self-Adaptive LLMs

Abstract

Self-adaptive large language models (LLMs) aim to solve the challenges posed by traditional fine-tuning methods, which are often computationally intensive and inflexible across diverse tasks. We introduce $\text{Transformer}^2$, a novel framework that adapts LLMs to unseen tasks in real time by selectively adjusting the singular components of their weight matrices. It uses a two-pass mechanism: the first pass identifies the task, and the second mixes task-specific "expert" vectors to best cope with test-time conditions. Our approach outperforms ubiquitous methods such as LoRA with fewer parameters and greater efficiency, works across various LLM architectures and modalities, and offers a scalable solution for enhancing the adaptability and task-specific performance of LLMs, paving the way for truly self-organizing AI systems.
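
The sketch below illustrates the idea of adjusting singular components described in the abstract: a frozen base weight is decomposed once, its singular values are rescaled by a mixture of task-specific expert vectors, and the adapted weight is reassembled at inference. It is a minimal illustration only; the helper name `adapt_weight` and the variables `expert_zs` and `alphas` are placeholders, not the paper's actual implementation.

```python
import torch

def adapt_weight(W, expert_zs, alphas):
    """Minimal sketch of singular-component adaptation (hypothetical helper).

    W         : (m, n) frozen base weight matrix
    expert_zs : list of (r,) task-specific expert vectors that rescale singular values
    alphas    : list of mixing weights produced by the first-pass task identification
    """
    # Decompose the frozen base weight: W = U diag(S) V^T
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)

    # Second pass: interpolate the task-specific expert vectors
    z_mix = sum(a * z for a, z in zip(alphas, expert_zs))

    # Selectively adjust the singular components with the mixed expert vector
    S_adapted = S * z_mix

    # Reassemble the adapted weight for the forward pass
    return U @ torch.diag(S_adapted) @ Vh

# Example usage with toy shapes (purely illustrative)
W = torch.randn(16, 8)
r = min(W.shape)
experts = [torch.ones(r), 1.1 * torch.ones(r)]   # two toy "expert" vectors
W_adapted = adapt_weight(W, experts, alphas=[0.7, 0.3])
```

Because only per-singular-value scaling vectors are stored per task, the adapted parameter count stays small relative to low-rank adapter methods such as LoRA, which is consistent with the efficiency claim above.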

Cite

Text

Sun et al. "$\text{Transformer}^2$: Self-Adaptive LLMs." NeurIPS 2024 Workshops: AFM, 2024.

Markdown

[Sun et al. "$\text{Transformer}^2$: Self-Adaptive LLMs." NeurIPS 2024 Workshops: AFM, 2024.](https://mlanthology.org/neuripsw/2024/sun2024neuripsw-transformer/)

BibTeX

@inproceedings{sun2024neuripsw-transformer,
  title     = {{$\text{Transformer}^2$: Self-Adaptive LLMs}},
  author    = {Sun, Qi and Cetin, Edoardo and Tang, Yujin},
  booktitle = {NeurIPS 2024 Workshops: AFM},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/sun2024neuripsw-transformer/}
}