TANDEM: Bi-Level Data Mixture Optimization with Twin Networks

Abstract

The capabilities of large language models (LLMs) depend significantly on training data drawn from various domains. Optimizing domain-specific mixture ratios can be modeled as a bi-level optimization problem, which we simplify into a single-level penalized form and solve with twin networks: a proxy model trained on primary data and a dynamically updated reference model trained with additional data. Our proposed method, Twin Networks for bi-level DatA mixturE optiMization (TANDEM), measures data efficacy through the difference between the twin models and up-weights domains that benefit more from the additional data. TANDEM provides theoretical guarantees and wider applicability compared with prior approaches. Furthermore, our bi-level perspective suggests new settings for studying domain reweighting, such as data-restricted scenarios and supervised fine-tuning, where optimized mixture ratios significantly improve performance. Extensive experiments validate TANDEM's effectiveness in all these scenarios.
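The abstract's core mechanism — up-weighting domains where the reference model (trained with additional data) outperforms the proxy model — can be sketched as a simple multiplicative reweighting step. This is an illustrative sketch only, not the authors' implementation: the function name, the use of per-domain evaluation losses, and the exponentiated-gradient-style update are all assumptions made for clarity.

```python
import numpy as np

def reweight_domains(proxy_losses, reference_losses, weights, lr=1.0):
    """Illustrative domain-reweighting step (hypothetical, not TANDEM's exact update).

    proxy_losses:     per-domain eval losses of the proxy model (primary data only)
    reference_losses: per-domain eval losses of the reference model (with additional data)
    weights:          current mixture ratios over domains (nonnegative, sum to 1)
    lr:               step size for the multiplicative update
    """
    # A larger positive gap means the additional data helped that domain more.
    gap = np.asarray(proxy_losses) - np.asarray(reference_losses)
    # Multiplicative update keeps weights positive; renormalize onto the simplex.
    new_w = np.asarray(weights) * np.exp(lr * gap)
    return new_w / new_w.sum()

# Two domains with equal initial weight; domain 1 benefits more from extra data.
w = reweight_domains([1.0, 1.0], [0.9, 0.5], [0.5, 0.5])
```

The multiplicative form is one common choice for simplex-constrained updates; the paper's actual update rule may differ.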

Cite

Text

Wang et al. "TANDEM: Bi-Level Data Mixture Optimization with Twin Networks." Advances in Neural Information Processing Systems, 2025.

Markdown

[Wang et al. "TANDEM: Bi-Level Data Mixture Optimization with Twin Networks." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/wang2025neurips-tandem/)

BibTeX

@inproceedings{wang2025neurips-tandem,
  title     = {{TANDEM: Bi-Level Data Mixture Optimization with Twin Networks}},
  author    = {Wang, Jiaxing and Xiang, Deping and Xu, Jin and Yi, Mingyang and Gong, Guoqiang and Zhang, Zicheng and Li, Haoran and Liu, Pengzhang and Chen, Zhen and Zhang, Ke and Fan, Ju and Jiang, Qixia},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/wang2025neurips-tandem/}
}