$\text{Transformer}^2$: Self-Adaptive LLMs
Abstract
Self-adaptive large language models (LLMs) aim to solve the challenges posed by traditional fine-tuning methods, which are often computationally intensive and inflexible when handling diverse tasks. We introduce $\text{Transformer}^2$, a novel framework that adapts LLMs to unseen tasks in real time by selectively adjusting the singular components of their weight matrices, using a two-pass mechanism: the task is first identified, and task-specific "expert" vectors are then mixed to best cope with test-time conditions. Our approach outperforms ubiquitous methods such as LoRA with fewer parameters and greater efficiency, generalizes across various LLM architectures and modalities, and offers a scalable solution for enhancing the adaptability and task-specific performance of LLMs, paving the way for truly self-organizing AI systems.
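To make the mechanism concrete, here is a minimal sketch of the idea described in the abstract, assuming a PyTorch setting: a weight matrix is decomposed with SVD, its singular values are rescaled by a mixture of task-specific "expert" vectors, and the adapted matrix is then used for the second inference pass. The names `adapt_linear`, `expert_vectors`, and `task_weights` are illustrative assumptions, not the paper's released implementation.

```python
# Sketch of singular-component adaptation with mixed expert vectors.
# Not the authors' code; shapes and the dispatch step are placeholders.
import torch

def adapt_linear(W: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Return W with its singular values scaled elementwise by the expert vector z."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    return U @ torch.diag(S * z) @ Vh

# Pass 1 (hypothetical dispatch): score how relevant each trained expert is
# to the incoming prompt; here the experts and scores are random placeholders.
expert_vectors = {"math": torch.rand(64), "code": torch.rand(64)}
task_weights = {"math": 0.7, "code": 0.3}

# Pass 2: mix the expert vectors and apply the adapted weight matrix.
z_mixed = sum(w * expert_vectors[name] for name, w in task_weights.items())
W = torch.randn(64, 64)          # stand-in for a pretrained weight matrix
W_adapted = adapt_linear(W, z_mixed)
```

Because each expert is just a vector of singular-value scales, its parameter count is much smaller than a LoRA adapter's low-rank matrices, which is the efficiency comparison the abstract draws.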
Cite
Text
Sun et al. "$\text{Transformer}^2$: Self-Adaptive LLMs." NeurIPS 2024 Workshops: AFM, 2024.
Markdown
[Sun et al. "$\text{Transformer}^2$: Self-Adaptive LLMs." NeurIPS 2024 Workshops: AFM, 2024.](https://mlanthology.org/neuripsw/2024/sun2024neuripsw-transformer/)
BibTeX
@inproceedings{sun2024neuripsw-transformer,
  title     = {{$\text{Transformer}^2$: Self-Adaptive LLMs}},
  author    = {Sun, Qi and Cetin, Edoardo and Tang, Yujin},
  booktitle = {NeurIPS 2024 Workshops: AFM},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/sun2024neuripsw-transformer/}
}