Train Your Cake and Eat It Too! Repurposing Collaborative Training to Tailor LLMs to Private Data Without Sharing

Abstract

In the emerging field of large language models (LLMs), a significant challenge arises when organizations with vast datasets lack the computational resources to independently train and fine-tune models. The challenge is compounded by privacy and compliance constraints: organizations cannot share their sensitive data, yet they still need external computational assistance for model training. In this paper, we implement, enhance, and empirically compare several methods, including Split Learning (SL) and select Federated Learning (FL) methods, which enable data-rich yet compute-poor clients to offload LLM training without sharing raw data. Our study evaluates these methods across multiple dimensions, including model quality and training time.
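To make the offloading idea concrete, below is a minimal, self-contained PyTorch sketch of Split Learning for language-model fine-tuning, illustrating only the data flow the abstract alludes to: the client keeps its raw token data and a small "bottom" of the model, while the compute-rich server trains the heavy "top" layers, and only intermediate activations and their gradients cross the boundary. The layer sizes, the single-block split point, and the toy data are illustrative assumptions, not the paper's actual setup.

```python
import torch
import torch.nn as nn

VOCAB, D_MODEL = 1000, 64

# Client side: embedding + first transformer block (cheap enough to run locally).
client_bottom = nn.Sequential(
    nn.Embedding(VOCAB, D_MODEL),
    nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=4, batch_first=True),
)
# Server side: remaining block + LM head (the expensive part, offloaded).
server_top = nn.Sequential(
    nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=4, batch_first=True),
    nn.Linear(D_MODEL, VOCAB),
)
opt_client = torch.optim.AdamW(client_bottom.parameters(), lr=1e-4)
opt_server = torch.optim.AdamW(server_top.parameters(), lr=1e-4)

def split_training_step(tokens: torch.Tensor, labels: torch.Tensor) -> float:
    """One SL round: only activations and activation-gradients are exchanged;
    the raw tokens never leave the client."""
    # --- client forward: produce the "smashed" activations sent to the server ---
    smashed = client_bottom(tokens)
    smashed_remote = smashed.detach().requires_grad_(True)

    # --- server forward/backward on its own layers ---
    logits = server_top(smashed_remote)
    loss = nn.functional.cross_entropy(logits.view(-1, VOCAB), labels.view(-1))
    opt_server.zero_grad()
    loss.backward()                      # also fills smashed_remote.grad
    opt_server.step()

    # --- client backward, driven by the gradient returned from the server ---
    opt_client.zero_grad()
    smashed.backward(smashed_remote.grad)
    opt_client.step()
    return loss.item()

# Toy usage: random token ids stand in for the client's private corpus.
tokens = torch.randint(0, VOCAB, (2, 16))
labels = torch.randint(0, VOCAB, (2, 16))
print(split_training_step(tokens, labels))
```

In a deployed system the two halves would live in separate processes and exchange the activation and gradient tensors over the network; the sketch keeps them in one process purely for readability.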

Cite

Text

Radovič et al. "Train Your Cake and Eat It Too! Repurposing Collaborative Training to Tailor LLMs to Private Data Without Sharing." ICML 2024 Workshops: ES-FoMo-II, 2024.

Markdown

[Radovič et al. "Train Your Cake and Eat It Too! Repurposing Collaborative Training to Tailor LLMs to Private Data Without Sharing." ICML 2024 Workshops: ES-FoMo-II, 2024.](https://mlanthology.org/icmlw/2024/radovic2024icmlw-train/)

BibTeX

@inproceedings{radovic2024icmlw-train,
  title     = {{Train Your Cake and Eat It Too! Repurposing Collaborative Training to Tailor LLMs to Private Data Without Sharing}},
  author    = {Radovič, Boris and Aljahdali, Mohammed and Canini, Marco and Pejović, Veljko and Khayyat, Zuhair},
  booktitle = {ICML 2024 Workshops: ES-FoMo-II},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/radovic2024icmlw-train/}
}