Hacking Task Confounder in Meta-Learning

Abstract

Fine-tuning large language models (LLMs) is difficult due to their huge model size. Recent Fourier-domain-based methods show potential for reducing fine-tuning costs. We propose a block circulant matrix-based fine-tuning method with a stable training heuristic that leverages the properties of circulant matrices and one-dimensional Fourier transforms to reduce storage and computation costs. Experiments show that our method uses 14× fewer parameters than VeRA, 16× fewer than LoRA, and 32× fewer FLOPs than FourierFT, while maintaining comparable or better task performance. Our approach presents a promising frequency-domain way to fine-tune large models on downstream tasks.
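The cost reduction rests on a standard property of circulant matrices: a circulant matrix is fully determined by its first column, and multiplying it with a vector is a circular convolution, computable in O(n log n) via the one-dimensional FFT instead of O(n²) with a dense matmul. A minimal NumPy sketch of this identity (illustrative only; not the paper's implementation):

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by vector x.
    Uses the convolution theorem: C @ x == IFFT(FFT(c) * FFT(x)),
    so only the n-vector c needs to be stored, not the n x n matrix."""
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

def circulant_dense(c):
    """Dense reference: column j of the circulant matrix is c rolled by j."""
    n = len(c)
    return np.stack([np.roll(c, j) for j in range(n)], axis=1)

rng = np.random.default_rng(0)
c = rng.standard_normal(8)   # first column defines the whole matrix
x = rng.standard_normal(8)
fast = circulant_matvec(c, x)
slow = circulant_dense(c) @ x
```

`fast` and `slow` agree to floating-point precision; storing `c` alone is the source of the parameter savings, and the FFT route is the source of the FLOP savings the abstract reports.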

Cite

Text

Wang et al. "Hacking Task Confounder in Meta-Learning." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/560

Markdown

[Wang et al. "Hacking Task Confounder in Meta-Learning." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/wang2024ijcai-hacking/) doi:10.24963/ijcai.2024/560

BibTeX

@inproceedings{wang2024ijcai-hacking,
  title     = {{Hacking Task Confounder in Meta-Learning}},
  author    = {Wang, Jingyao and Ren, Yi and Song, Zeen and Zhang, Jianqi and Zheng, Changwen and Qiang, Wenwen},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {5064--5072},
  doi       = {10.24963/ijcai.2024/560},
  url       = {https://mlanthology.org/ijcai/2024/wang2024ijcai-hacking/}
}