Hacking Task Confounder in Meta-Learning
Abstract
Fine-tuning large language models (LLMs) is difficult because of their size. Recent Fourier domain-based methods show potential for reducing fine-tuning costs. We propose a block circulant matrix-based fine-tuning method with a stable training heuristic that leverages the properties of circulant matrices and one-dimensional Fourier transforms to reduce storage and computation costs. Experiments show that our method uses 14× fewer parameters than VeRA, is 16× smaller than LoRA, and requires 32× fewer FLOPs than FourierFT, while maintaining close or better task performance. Our approach offers a promising frequency-domain way to fine-tune large models on downstream tasks.
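The circulant-matrix property the abstract relies on can be illustrated with a minimal sketch. It is not the authors' implementation. It only shows the standard fact that multiplying by an n × n circulant matrix equals a circular convolution, so the product can be computed with one-dimensional FFTs from the n values of the first column alone. The function names `circulant_from_first_column` and `circulant_matvec_fft` are hypothetical, and NumPy is an assumed dependency.

```python
import numpy as np


def circulant_from_first_column(c):
    """Build the dense n x n circulant matrix whose first column is c."""
    n = len(c)
    # Entry (i, j) of a circulant matrix is c[(i - j) mod n].
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]


def circulant_matvec_fft(c, x):
    """Compute C @ x with 1-D FFTs, without ever forming C.

    Stores only the n entries of c and costs O(n log n),
    versus n*n entries and O(n^2) for the dense product.
    """
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))


# Illustrative data only; the sizes and values are arbitrary.
rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)  # first column defines the whole matrix
x = rng.standard_normal(n)

dense = circulant_from_first_column(c) @ x
fast = circulant_matvec_fft(c, x)
assert np.allclose(dense, fast)
```

A block circulant weight matrix applies the same idea block by block, which is where the storage and FLOP savings described above would come from under these assumptions.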
Cite
Text
Wang et al. "Hacking Task Confounder in Meta-Learning." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/560
Markdown
[Wang et al. "Hacking Task Confounder in Meta-Learning." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/wang2024ijcai-hacking/) doi:10.24963/ijcai.2024/560
BibTeX
@inproceedings{wang2024ijcai-hacking,
title = {{Hacking Task Confounder in Meta-Learning}},
author = {Wang, Jingyao and Ren, Yi and Song, Zeen and Zhang, Jianqi and Zheng, Changwen and Qiang, Wenwen},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2024},
pages = {5064-5072},
doi = {10.24963/ijcai.2024/560},
url = {https://mlanthology.org/ijcai/2024/wang2024ijcai-hacking/}
}